Foreword


The Indian Academy of Sciences launched Resonance as a monthly journal
devoted to science education in January 1996. Resonance is aimed largely at
undergraduate students and teachers of science, though material of interest to somewhat
younger students is also included. Each issue contains papers that span a wide area
of science and mathematics, in various formats. Some are individual general articles;
others are series with several parts. An effort is made to ensure good
expository quality in all of them.

“Echoes from Resonance” is a series of books born out of Resonance, putting
together in a coherent manner a collection of articles (both series and single pieces)
taken from Resonance, all written around a common theme. Typically, the individual
articles would have appeared quite independently at different times. These collections
should prove useful to a reader who is keen to learn about a specific subject,
with accounts given by different authors from different perspectives, but all in an
expository manner. We hope these volumes will be useful for students and teachers
alike, and that they will complement the structure of individual issues of Resonance,
which cover different areas of science and mathematics in a balanced manner.


N.  Mukunda 


Preface 


Number theory has been a subject of study by mathematicians from the most ancient
of times. In the Plimpton 322 clay artefact, excavated from the ruins of ancient Babylon,
one finds a systematic listing of a large number of Pythagorean triples, that is, triples
$(a, b, c)$ of positive integers such that $a^2 + b^2 = c^2$; they appear to be listed in order
of increasing $c/a$ ratio. (One sees in the table the beginnings of trigonometry.) The
Greeks had a deep interest in number theory. Euclid’s great text, The Elements,
generally considered a book only on geometry, actually contains a fair amount
of number theory too; in particular, it contains the proofs of two gems discovered
by the Greeks: the irrationality of $\sqrt{2}$ and the infinitude of the primes. It also contains
a description of the algorithm now known as the Euclidean algorithm, which
computes the greatest common divisor of two given numbers. In ancient India too
there was much interest in number theory, particularly in Diophantine equations; for
instance, in the linear two-variable equation $ax + by = c$, where $a$, $b$, $c$ are given
integers, and in the equation later to be known as the Pell equation ($x^2 - Ny^2 = 1$,
where $N$ is a given positive integer). Building on the work of Brahmagupta (6th century),
Bhaskara II (12th century) gave a completely general way of solving the latter
equation.

In this book we offer the reader some articles in number theory that appeared
in Resonance over the years 1996-2001. Traditionally, number theory begins with
a study of congruences (Wilson’s and Fermat’s theorems, the Chinese remainder
theorem, quadratic residues, primitive roots, ...), then proceeds to a study of prime
numbers (the infinitude of various classes of primes, divergence of the sum $\sum 1/p$
taken over all primes, ...) and later to a study of Diophantine equations (solution
of equations such as $x^2 + y^2 = z^2$, $ax + by = c$, where $a$, $b$, $c$ are given integers,
Pell’s equation, ...). The last two topics (prime numbers, Diophantine equations)
are distinguished by the extraordinary diversity, in level of difficulty, of
the problems they offer to students. There is something in number theory for
practically everyone!

The articles included within form a varied lot, with the first half (articles 1 to 8)
being of an elementary nature. We begin with a short essay on the axiomatic approach
in modern mathematics: on how conventions sometimes need to be followed for the
sake of preserving uniformity and maintaining mathematical harmony. The next two
articles deal with elementary problems: “Find four positive integers such that
the sum of any two is a square”, and Bachet’s problem (“100 kg with five stones”),
solved using generating functions. There is a piece on mathematical induction, one
of the most trustworthy and important techniques in the toolkit of any mathematician,
particularly the number theorist and combinatorist. The following article describes
Euler’s proof of the infinitude of primes, which establishes rather more than Euclid’s
well-known proof of the same result. Then there is a short piece on Fermat’s two-square
theorem, elaborating on a “crisp and elegant proof” by Zagier of the theorem
that a prime of the form 1 (mod 4) is a sum of two squares. The article also suggests
an algorithmic approach towards proving the theorem. In the following article, “Fermat’s
Two Squares Theorem Revisited”, Bhaskar Bagchi proves the correctness of
the algorithm. Following this is a report on recent work done on the factorization of
Fermat numbers, defined by $F_n = 2^{2^n} + 1$. (Fermat had conjectured, perhaps rather
rashly, that the numbers $F_n$ are all prime. Now it appears that for $n > 4$ they may
never be prime!)

Articles 9-16 are of a more substantive nature, beginning with a two-part article
(articles 9 and 10) on the class number problem (“Binary Quadratic Forms” and
“Algebraic Number Theory”), a topic dealt with for the first time and in considerable
detail by Gauss in his path-breaking book Disquisitiones Arithmeticae. The two articles
which follow, “Roots are not contained in cyclotomic fields” and “Die Ganzen
Zahlen hat Gott gemacht, alles andere ist Menschenwerk”, deal with cyclotomic
polynomials and cyclotomic fields, giving interesting applications of ideas introduced
in the previous two-part article. A proof of a beautiful relation between prime-representing
quadratic equations and class numbers is the subject of the next article.
We then have an article on congruent numbers, dealing with a problem dating from
ancient times but which has intimate connections with a very modern topic, that of
elliptic curves. This is followed by an expository article on one of the great mathematical
achievements of the 20th century: the proof of “Fermat’s Last Theorem”
by Andrew Wiles. To top off the collection we have a brief survey of some currently
unsolved problems in number theory. (In passing, we remark that references
to Fermat appear surprisingly many times in this collection!)

We  hope  that  the  reader  will  enjoy  this  varied  collection. 


Shailesh  Shirali 
C  S  Yogananda 
On Provability versus Consistency in
Elementary  Mathematics 


Shailesh  A  Shirali 

A reader asks, “Why is 1 not listed as a prime? After all, does it not satisfy the
stated criteria for primality?” This chapter is written in response to this question.

The layperson usually thinks that mathematics deals with absolute truths, and indeed
this was how mathematics was viewed during earlier centuries. However, ever since
the epochal discoveries of Bolyai, Lobachevsky and Riemann that there can be geometries
(note the plural) other than the one presented in Euclid’s text The Elements,
this implicit notion has had to be dropped. Even the notion that everything in mathematics
is provably true or provably false had to be abandoned, after the astonishing
results obtained by Gödel in 1930. Alongside this development, mathematics has
seen a pioneering and extremely productive method: the axiomatic method, in which
new areas of mathematics get created merely by defining suitable sets of axioms. As a
result, the accent in mathematics has to some extent shifted to the study of axiomatic
systems, and the essential question in such cases has become one of consistency and
richness of the axiom system rather than its intrinsic truth or falsity. Much of modern
algebra, starting with group theory, the theory of fields and rings and vector spaces
and so on, can be viewed in this light. Loosely speaking, one might say that in the
modern mathematical paradigm, true is roughly equivalent to consistent, while false
is equivalent to self-contradictory.¹

1 It is an interesting commentary on the psychology of modern mathematicians that, when pressed, most
of them will readily say that there is no such thing as absolute truth in mathematics, and that a mathematical
proposition is true or false only with reference to a particular axiomatic system. But amongst themselves
most mathematicians ‘know’ that what they deal with does indeed refer to something ‘concrete’,
‘real’ and ‘absolute’!

Here are some instances to illustrate the theme of consistency as opposed to absolute
truth. In school arithmetic, one encounters the question, “Why is $(-1) \times (-1) = 1$?”
Many ‘proofs’ are offered, but the plain fact is that the relation is a convention, not
an absolute truth, and therefore there is no question of proving it.² One adopts it
because of its implication for the law of distributivity of multiplication over addition
(LDMA for short), according to which $a(b + c) = ab + ac$ for all $a$, $b$, $c$. The LDMA
is too valuable an axiom to lose! Here is roughly how it happens. Starting with $\mathbb{N}$, the
set of positive integers, with $\times$ and $+$ defined on $\mathbb{N}$ in the usual manner, we enlarge
the set by including 0 and imposing the following rules:

$$a + 0 = 0 + a = a, \qquad a \times 0 = 0 \times a = 0.$$

Note that the two statements are consistent with one another because of the LDMA.
For example, $2 \times 3 = 2 \times (3 + 0) = 2 \times 3 + 2 \times 0$, so we must have $2 \times 0 = 0$. Next,
one constructs the negative numbers via the rule $a + (-a) = 0$. To do addition we
call upon commutativity and associativity. For instance, we have:

(-2)  +  (-3)  +  (2  +  3)  =  (-2)  +  2  +  (-3)  +  3  =  0  +  0  =  0. 

Therefore,  (-2)  +  (-3)  +  5  =  0  and  (-2)  +  (-3)  =  -5. 

Finally, multiplication is taken up, and here one invokes distributivity. We find that
we are forced to adopt the convention that $(-1) \times 1 = -1$ and $(-1) \times (-1) = 1$:

$$0 = 0 \times 1 = \{1 + (-1)\} \times 1 = \{1 \times 1\} + \{(-1) \times 1\} = 1 + \{(-1) \times 1\},$$

therefore $(-1) \times 1 = -1$; and

$$0 = \{1 + (-1)\} \times (-1) = \{1 \times (-1)\} + \{(-1) \times (-1)\} = -1 + \{(-1) \times (-1)\},$$

therefore $(-1) \times (-1) = +1$. The point is that we need these relations if we are to
preserve the LDMA, which we cannot afford to lose. The consistency of the system
must be preserved at all cost.³

Here is another question, also asked at the school level: Why is $a^0 = 1$ for all
$a > 0$? We proceed to resolve this in a similar vein. Let $x, y \in \mathbb{N}$; then $a^{x+y} = a^x \times a^y$,
and $a^{x-y} = a^x / a^y$ when $x > y$. These follow from the very meaning of $a^n$ when $n$
is a positive integer. What do we do with $a^0$? If we wish to have a system of algebra
that is consistent and easy to work with, then we need to adopt the convention that
$a^0 = 1$ (for instance, extending $a^{x+y} = a^x \times a^y$ to $y = 0$ forces $a^x = a^x \times a^0$).
There is nothing absolute about this. Rather, we choose to give $a^0$ a meaning
that makes it easy to deal with. In short, we make $a^0$ a well-behaved object. (Note
that $0^0$ cannot be given any consistent meaning, nor can $0/0$; that is, it is not possible
to make these objects well-behaved.)

2 Here is a particularly preposterous proof which I encountered a few years back: the parabola $y = x^2$ is
symmetric in the y-axis, therefore minus times minus equals plus!

3 Sacrificing the LDMA would mean that we lose the ring structure of $\mathbb{Z}$.

Finally we take up the question: “Is 1 a prime?” We recall the fundamental theorem
of arithmetic (FTA): Every integer $N > 1$ can be expressed in just one way as
a product of primes, except possibly for the order of occurrence of the primes. If 1
were included in the set of primes $P$, then the fact that $1^n = 1$ for all integers $n$ would
require us to rephrase the FTA by adding the clause “... except that 1 may occur to
any arbitrary power.” We would end up labelling 1 as a special prime, to be excluded
from most of the interesting theorems about prime numbers. Indeed, what would in
all likelihood happen is that theorems about primes would end up being phrased in
terms of the set $P' = P \setminus \{1\}$. Thus giving 1 membership in $P$ proves to be a nuisance,
and it is simpler to keep it out right at the start.

The matter can be considered from another viewpoint. Let $\mathbb{Z}$ denote the set of
integers, and consider the set of complex numbers of the form $a + bi$, where $a, b \in \mathbb{Z}$
and $i = \sqrt{-1}$. These are the Gaussian integers, first studied in detail by Gauss, and
the set of such numbers is denoted by $\mathbb{Z}(i)$. (Note that $\mathbb{Z}$ is a subset of $\mathbb{Z}(i)$.) Now
in $\mathbb{Z}$, the only elements that possess multiplicative inverses are $\pm 1$ (that is, their
reciprocals lie within the same set); these are the units of $\mathbb{Z}$. In $\mathbb{Z}(i)$, the set of units
turns out to be $\{\pm 1, \pm i\}$. (The reader is invited to verify that there are no other units in
$\mathbb{Z}(i)$.) Arithmetic can be done in $\mathbb{Z}(i)$ just as it is in $\mathbb{Z}$; for instance, we can factorize
numbers:

$$9 + 7i = (2 + 3i)(3 - i), \qquad 13 = (2 + 3i)(2 - 3i), \quad \ldots$$

Observe that 13, which is prime in $\mathbb{Z}$, loses its primality status in $\mathbb{Z}(i)$.

We declare a number $z \in \mathbb{Z}(i)$ to be prime if $z$ is not a unit and if in every factorization
$z = uv$, with $u, v \in \mathbb{Z}(i)$, either $u$ or $v$ is a unit.⁴ The reader is invited to verify
that 3, 7 and $2 + 3i$ are Gaussian primes, whereas 2, 5 and 13 are composite (because
$2 = (1 + i)(1 - i)$, $5 = (1 + 2i)(1 - 2i)$, etc.). We now have the result: every number
in $\mathbb{Z}(i)$, not 0 or a unit, can be written as a product of Gaussian primes; moreover,
there is essentially only one way of doing this.⁵ That is, we have an analogue of the
FTA for the Gaussian integers, provided that the units are not considered as primes.
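The verifications invited above can be carried out mechanically. What follows is a minimal Python sketch, not part of the original article: a Gaussian integer $a + bi$ is stored as the pair `(a, b)`, and the helper names `gmul`, `norm`, `divides` and `is_gaussian_prime` are hypothetical ones chosen for illustration. It relies on the standard fact that the norm $a^2 + b^2$ is multiplicative, so a proper (non-unit) factor of $z$ must have norm strictly between 1 and the norm of $z$.

```python
from math import isqrt

def gmul(z, w):
    """Product of two Gaussian integers given as (real, imaginary) pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def norm(z):
    """Norm a^2 + b^2 of a + bi; it equals 1 exactly for the units."""
    a, b = z
    return a * a + b * b

def divides(u, z):
    """True if the nonzero Gaussian integer u divides z exactly."""
    a, b = gmul(z, (u[0], -u[1]))  # z times the conjugate of u
    n = norm(u)
    return a % n == 0 and b % n == 0

def is_gaussian_prime(z):
    """Brute-force test of the definition in the text (fine for small z)."""
    n = norm(z)
    if n <= 1:  # 0 and the units are excluded
        return False
    r = isqrt(n)
    for a in range(-r, r + 1):
        for b in range(-r, r + 1):
            u = (a, b)
            # a proper factor is a non-unit whose norm is smaller than that of z
            if 1 < norm(u) < n and divides(u, z):
                return False
    return True

print(gmul((2, 3), (3, -1)))   # (9, 7), i.e. 9 + 7i
print(gmul((2, 3), (2, -3)))   # (13, 0), i.e. 13
print([is_gaussian_prime(z) for z in [(3, 0), (7, 0), (2, 3)]])
print([is_gaussian_prime(z) for z in [(2, 0), (5, 0), (13, 0)]])
```

The exhaustive search is deliberately naive; it is meant only to mirror the definition, not to be an efficient primality test.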

4 Since this article deals with terminology, it should be pointed out that what we refer to as ‘prime’ here
is usually called ‘irreducible’ in the standard texts. In the standard definition, $p$ is prime if we have the
implication $p \mid ab \Rightarrow p \mid a$ or $p \mid b$. In the class of rings known as UFDs the two notions coincide. Examples
of UFDs are $\mathbb{Z}$, $\mathbb{Z}(i)$ and $\mathbb{Z}(\sqrt{2})$. However, $\mathbb{Z}(\sqrt{10})$ is not a UFD.

5 Units may enter the picture, hence the use of the words ‘essentially only one way’.

Other such number systems can be constructed. Indeed, once one grasps the idea,
such systems seem to be available in abundance and can be spotted in many settings.
For instance, consider the set $\mathbb{Z}(\sqrt{2})$ whose elements are numbers of the form
$a + b\sqrt{2}$ where $a, b \in \mathbb{Z}$. This system presents itself quite naturally when one tries
to solve the equation $x^2 - 2y^2 = \pm 1$ in integers. A striking fact about $\mathbb{Z}(\sqrt{2})$ is
that it has infinitely many units. (The reader is invited to show this. Hint: Show that
$\sqrt{2} - 1$ and its integral powers are units; (harder) show that, up to sign, these are the only units.)
What are the primes of $\mathbb{Z}(\sqrt{2})$? It turns out that $\sqrt{2}$ is prime, as are the numbers
3, 5 and 11, but not 7, because $7 = (3 - \sqrt{2}) \times (3 + \sqrt{2})$, nor 17, because
$17 = (5 - 2\sqrt{2}) \times (5 + 2\sqrt{2})$. It is an interesting exercise to classify the primes
of $\mathbb{Z}(i)$ and $\mathbb{Z}(\sqrt{2})$. Is there an analogue of the FTA for $\mathbb{Z}(\sqrt{2})$? The answer is
“yes”, though it is hard work to prove it. However, there are numerous number systems
that closely resemble $\mathbb{Z}(i)$ and $\mathbb{Z}(\sqrt{2})$ but which do not have the FTA property.
An example is $\mathbb{Z}(\sqrt{10})$: it can be shown that 2, 3, $4 - \sqrt{10}$ and $4 + \sqrt{10}$ are primes in
$\mathbb{Z}(\sqrt{10})$, yet

$$6 = 2 \times 3 = (4 - \sqrt{10}) \times (4 + \sqrt{10}),$$

providing a counterexample to the FTA. As the reader will have noted by now, the
word prime no longer carries a fixed meaning; it acquires meaning only with reference
to a particular context.⁶ The interested reader can consult the well-known text
by G H Hardy and E M Wright (An Introduction to the Theory of Numbers, Chapters
XIV and XV) for further details.
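The arithmetic claims about $\mathbb{Z}(\sqrt{2})$ and $\mathbb{Z}(\sqrt{10})$ can be spot-checked with exact integer arithmetic. The sketch below is an illustration only, under the assumption that $a + b\sqrt{d}$ is stored as the pair `(a, b)`; the names `mul` and `norm` are hypothetical helpers, not notation from the article. It uses the standard fact that the norm $a^2 - db^2$ is multiplicative and equals $\pm 1$ exactly for units.

```python
def mul(x, y, d):
    """Product of a + b*sqrt(d) and c + e*sqrt(d), each stored as a pair."""
    (a, b), (c, e) = x, y
    return (a * c + d * b * e, a * e + b * c)

def norm(x, d):
    """Norm a^2 - d*b^2; multiplicative, and equal to +1 or -1 for units."""
    a, b = x
    return a * a - d * b * b

# Powers of sqrt(2) - 1, stored as (-1, 1), all have norm +1 or -1.
power, power_norms = (1, 0), []
for _ in range(6):
    power = mul(power, (-1, 1), 2)
    power_norms.append(norm(power, 2))

seven = mul((3, -1), (3, 1), 2)        # (3 - sqrt 2)(3 + sqrt 2)
seventeen = mul((5, -2), (5, 2), 2)    # (5 - 2 sqrt 2)(5 + 2 sqrt 2)
six = mul((4, -1), (4, 1), 10)         # (4 - sqrt 10)(4 + sqrt 10)

# In Z(sqrt 10) the elements 2, 3, 4 - sqrt 10, 4 + sqrt 10 have norms 4, 9, 6, 6.
# A proper factor would need norm +-2 or +-3, so a^2 would be 2, 3, 7 or 8 (mod 10),
# which is impossible because no square ends in one of those digits.
squares_mod_10 = {a * a % 10 for a in range(10)}
no_factor_norms = squares_mod_10.isdisjoint({2, 3, 7, 8})

print(power_norms, seven, seventeen, six, no_factor_norms)
```

The last check is the usual norm argument for why the four factors of 6 cannot be broken down further, which is what makes the two factorizations genuinely different.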

Here is another example of axiomatic generalization. A rational number can be
thought of as a root of the equation $mx + n = 0$, with $m, n \in \mathbb{Z}$, $m \neq 0$; here $m = 1$
gives us the integers, which we call the rational integers. Generalizing, we define an
algebraic number as a root of a polynomial equation $ax^n + bx^{n-1} + cx^{n-2} + \cdots = 0$
with $a, b, c, \ldots \in \mathbb{Z}$, $a \neq 0$ and $n \in \mathbb{N}$; if $a = 1$ then we have an algebraic integer.
It is a non-trivial fact that the set $A$ of algebraic integers is closed under addition
and multiplication but not under division. Thus $A$ behaves very much like $\mathbb{Z}$, and we
have at hand a genuine generalization of the notion of integer.

These examples may serve to highlight the extraordinary freedom that the axiomatic
approach brings into mathematics. Some critics complain, however, that in
exercising this freedom, mathematicians tend to “go too far”; but that is another
matter altogether and we shall not address it here.

TAIL-PIECE. Mr T B Nagarajan of Thanjavur has sent me the following problem:
Find four distinct positive integers such that the sum of any two of them is a square.
He writes that the problem is not too hard if the restriction on positivity is removed,
or if one is content with solutions having very large integers. In support of this statement,
he lists the following solutions:

{55967, 78722, 27554, 10082}, {15710, 86690, 157346, 27554}.

Readers are invited to take a crack at the problem. (To find a triple with the stated
property is much easier; an example is {6, 19, 30}. Readers may enjoy trying to list
further such triples before going on to the more challenging four-number problem.)
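Readers who wish to check the listed solutions, or to hunt for triples by machine, might start from a brute-force sketch such as the following. It is an illustration only; the function names `is_square`, `pairwise_square_sums` and `triples` are hypothetical, and the search limit is an arbitrary choice.

```python
from itertools import combinations
from math import isqrt

def is_square(n):
    """True if n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n

def pairwise_square_sums(nums):
    """True if every pair of the given numbers adds up to a square."""
    return all(is_square(x + y) for x, y in combinations(nums, 2))

def triples(limit):
    """All triples a < b < c <= limit whose pairwise sums are squares."""
    return [t for t in combinations(range(1, limit + 1), 3)
            if pairwise_square_sums(t)]

print(pairwise_square_sums([55967, 78722, 27554, 10082]))
print(pairwise_square_sums([15710, 86690, 157346, 27554]))
print(triples(30))
```

A direct search of this kind works comfortably for triples but becomes slow for quadruples, which is one reason a structured approach to the four-number problem is worth having.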


Shailesh  A  Shirali 
Rishi  Valley  School 
Rishi  Valley  517  352 
Andhra  Pradesh 


6 Historically, many of these developments were a result of efforts to prove Fermat’s last theorem. See
Resonance, Volume 1, No. 1 for more details.


To  Find  Four  Distinct  Positive  Integers  such 
that  the  Sum  of  Any  Two  of  them  is  a  Square 

S  H  Aravind 

The problem is to find four distinct positive integers such that any two of them add
up to a square. Let $a, b, c, d$ with $a < b < c < d$ be four positive integers such that
the sum of any two of them is a square. Observing that

$$a + b + c + d = (a + b) + (c + d),$$
$$a + b + c + d = (a + c) + (b + d),$$
$$a + b + c + d = (a + d) + (b + c),$$

we need to find a number which can be written as a sum of two non-zero squares in
three different ways. We proceed to find such a number.

To begin with, note that if two numbers $n$ and $n'$ can each be expressed as a sum
of two squares, then $nn'$ can also be so expressed in two ways. Indeed, if

$$n = k^2 + l^2, \qquad n' = k'^2 + l'^2,$$

then

$$nn' = (kk' + ll')^2 + (kl' - lk')^2 = (kl' + lk')^2 + (kk' - ll')^2.$$

Start with 25 and 13, both of which are sums of two squares: $25 = 3^2 + 4^2$,
$13 = 2^2 + 3^2$. By the identity, $25 \times 13 = 325$ can be expressed as a sum of two
squares in three ways: $325 = 10^2 + 15^2 = 6^2 + 17^2 = 1^2 + 18^2$. (Note that $10^2 + 15^2 = 5^2(2^2 + 3^2)$.)
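As a check on the identity and on the three representations of 325, here is a small Python sketch. It is illustrative only: `compose` and `two_square_reps` are hypothetical names, and `compose` simply evaluates the two right-hand sides of the identity above, taking absolute values so that the entries are non-negative.

```python
from math import isqrt

def compose(k, l, kp, lp):
    """The two representations of (k^2 + l^2)(k'^2 + l'^2) given by the identity."""
    first = (abs(k * kp + l * lp), abs(k * lp - l * kp))
    second = (abs(k * lp + l * kp), abs(k * kp - l * lp))
    return first, second

def two_square_reps(n):
    """All pairs (x, y) with 1 <= x <= y and x^2 + y^2 = n."""
    reps = []
    for x in range(1, isqrt(n // 2) + 1):
        y = isqrt(n - x * x)
        if x * x + y * y == n:
            reps.append((x, y))
    return reps

print(compose(3, 4, 2, 3))      # from 25 = 3^2 + 4^2 and 13 = 2^2 + 3^2
print(compose(5, 0, 2, 3))      # from 25 = 5^2 + 0^2, giving 10^2 + 15^2
print(two_square_reps(325))
```

Running `two_square_reps` on other products of primes of the form 1 (mod 4) is a quick way to look for numbers with three or more representations.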

We show that $13 \times 25^2 = 8125$ has three representations as a sum of two squares and gives
us a solution. Consider the following representations (among others):

$$8125 = 30^2 + 85^2 = 50^2 + 75^2 = 58^2 + 69^2.$$
Thus, we take $a + b + c + d = 8125$ and look for solutions $a, b, c, d$ in positive
integers. Then $a + b$, $a + c$, $a + d$, $b + c$, $b + d$, $c + d$ are precisely $30^2$, $85^2$, $50^2$, $75^2$,
$58^2$, $69^2$ in some order. We have

$$a + b < a + c < a + d < b + d < c + d,$$
$$\text{and} \quad a + b < a + c < b + c < b + d < c + d.$$

We arbitrarily take $b + c$ to be less than $a + d$. So we have $a + b = 30^2$, the least of
the squares, $a + c = 50^2$, the next smallest, and $c + b = 58^2$. We get $c - b = 1600$ and,
solving for $c$, $b$, we have $c = 2482$, $b = 882$. From this we get $a = 18$, $d = 4743$. Thus
$\{18, 882, 2482, 4743\}$ is a set of four positive integers with the required property.
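The step from the three representations to the four integers is simple linear algebra, and can be sketched as follows. This is an illustration under the same choice as in the text (the three given sums are taken to be $a + b$, $a + c$ and $b + c$); the function name `quadruple` is hypothetical.

```python
from itertools import combinations
from math import isqrt

def is_square(n):
    """True if n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n

def quadruple(total, ab, ac, bc):
    """Recover (a, b, c, d) from a+b, a+c, b+c and a+b+c+d = total."""
    twice_a = ab + ac - bc          # (a+b) + (a+c) - (b+c) = 2a
    if twice_a <= 0 or twice_a % 2:
        raise ValueError("these three sums give no positive integer a")
    a = twice_a // 2
    b, c = ab - a, ac - a
    d = total - (a + b + c)
    return a, b, c, d

solution = quadruple(8125, 30**2, 50**2, 58**2)
pair_sums = sorted(x + y for x, y in combinations(solution, 2))
print(solution)
print(pair_sums, all(is_square(s) for s in pair_sums))
```

The parity test matters: not every choice of three squares yields integers, so other totals with three representations have to be tried until one passes.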

The same method can give large integer solutions too. For example, the following
solutions can be obtained by choosing suitable squares:

{4190, 10210, 39074, 83426}, {7070, 29794, 71330, 172706}.


S  H  Aravind 
12,  First  Main  Road 
Ponmeni  Jayanagar 
Madurai  625  010 
Bachet’s Problem


B  Bagchi 

A  grocery  shopkeeper  keeps  five  stones  of  different  weights.  He  is  able  to  use 
a  common  balance  and  weigh  out  quantities  ranging  from  1  to  100  kg,  in  steps 
of  1  kg.  What  are  the  weights  of  these  five  stones? 

The above is the problem “100 kg with five stones” posed by R Yusufzai in the
“Think it Over” column of the July 1996 issue of Resonance. A much better problem
will result if the figure 100 is replaced by 121. This is because the question “What are
the weights of these five stones?” seems to suggest that there are uniquely determined
weights to be found! However, as may easily be verified, the weights in kg of the
stones might be 1, 3, 9, 27 and $m$, where $m$ is any integer in the range $60 \le m \le 81$.
In fact, there are many other solutions to the problem as posed. If, however, it was
given that the grocer can weigh any object of weight between 1 kg and 121 kg (in
steps of 1 kg) using his five stones, then the weights (in kg) of the stones must have
been 1, 3, 9, 27 and 81. This is the case $k = 5$ of the result stated and proved below.
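The claim about the fifth stone is easy to verify by exhaustive search. In the sketch below, which is an illustration rather than anything from the article, each stone is either placed in the pan opposite the object (sign +1), placed alongside the object (sign -1), or left out (sign 0); the names `measurable` and `covers` are hypothetical.

```python
from itertools import product

def measurable(stones):
    """All integers obtainable as a signed sum of the stones, signs in {-1, 0, 1}."""
    return {sum(sign * weight for sign, weight in zip(signs, stones))
            for signs in product((-1, 0, 1), repeat=len(stones))}

def covers(stones, top):
    """True if every integer weight from 1 to top can be weighed."""
    reachable = measurable(stones)
    return all(m in reachable for m in range(1, top + 1))

# Fifth stones m that work for 100 kg alongside 1, 3, 9, 27.
good_for_100 = [m for m in range(1, 122) if covers((1, 3, 9, 27, m), 100)]
# Fifth stones m that work for 121 kg alongside 1, 3, 9, 27.
good_for_121 = [m for m in range(1, 200) if covers((1, 3, 9, 27, m), 121)]

print(good_for_100)
print(good_for_121)
```

Note that this search fixes the first four stones at 1, 3, 9, 27, so it illustrates the remark about $m$ but does not by itself establish uniqueness among all five-stone sets.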

The problem is a well-known variation of an old problem due to Bachet (see
Suggested Reading). In the original binary version, the grocer cannot subtract, so
he must put the stones in one pan and the object in the other. Mr Yusufzai’s problem
is an instance of the ternary version where this restriction is removed. The general
problem (in its ternary version) may be stated as follows:

Given a positive integer $k$, find the largest integer $N_k$ such that any
object whose weight is an integer between 1 and $N_k$ (ends included) can
be weighed using $k$ stones of suitable integral weights. In this notation,
the problem is to show that $N_5 \ge 100$.

In  fact,  we  have: 

THEOREM. $N_k = \frac{3^k - 1}{2}$. If $k$ stones are such that all integral weights between 1
and $N_k$ can be measured using them, then the weights of these stones must be $3^j$,
$0 \le j \le k - 1$.

This is, essentially, Theorem 141 in the book by Hardy and Wright (see Suggested
Reading).

In order to prove this, we must convert it into a precise mathematical statement.
To this end, let $a_0, \ldots, a_{k-1}$ be the (positive integral) weights of $k$ stones. In order to
weigh an object of integral weight $m$, the grocer places the object together with some
of the stones on the right pan (say) and puts some other stones on the left pan. For
$0 \le j \le k - 1$, put $\varepsilon_j = 1$ if the stone of weight $a_j$ is placed on the left pan, $\varepsilon_j = -1$
if it is on the right pan, $\varepsilon_j = 0$ if it is not used. Since the two pans must balance, we
get

$$m = \sum_{j=0}^{k-1} \varepsilon_j a_j, \quad \text{where } \varepsilon_j \in \{0, 1, -1\} \text{ for } 0 \le j \le k - 1. \tag{1}$$

This  leads  us  to 

DEFINITION. If $A = \{a_0, \ldots, a_{k-1}\}$ is a finite set of positive integers, then the
capacity $C(A)$ of $A$ is the largest integer $M$ such that for every integer $m$ in the
range $1 \le m \le M$, equation (1) has a solution.

Informally, the capacity $C(A)$ is the largest $M$ such that all weights between 1 and $M$
can be measured using $k$ stones whose weights are in $A$. In terms of this definition,
the above theorem may be restated as follows:

THEOREM. If $A$ is of size $k$, then $C(A) \le \frac{3^k - 1}{2}$. Equality holds here if and only if
$A = \{3^j : 0 \le j \le k - 1\}$.

To prove the theorem, note that if $m$ can be written as in (1), then so can $-m$ (just
change the signs of all the $\varepsilon_j$); also, trivially, $m = 0$ can be written thus (take $\varepsilon_j = 0$
for all $j$). Therefore, if $C(A) = M$, then all the $2M + 1$ integers $m$ in the range
$-M \le m \le M$ can be expressed as in (1). But there are three choices for $\varepsilon_j$ for
each $j$, hence only $3^k$ choices for the right hand side of (1). Hence, $2M + 1 \le 3^k$,
or $C(A) \le (3^k - 1)/2$. Now, if we take $A = \{3^j : 0 \le j \le k - 1\}$, then for
$1 \le m \le (3^k - 1)/2$ write $[(3^k - 1)/2] - m$ in base 3:

$$[(3^k - 1)/2] - m = \sum_{j=0}^{k-1} \delta_j 3^j,$$

where $\delta_j \in \{0, 1, 2\}$. Put $\varepsilon_j = 1 - \delta_j$. Then (1) holds. Thus $C(A) \ge (3^k - 1)/2$ for
this set. Together with the previous inequality, we get $C(A) = (3^k - 1)/2$.
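The existence half of the proof is constructive, and the construction can be sketched in a few lines of Python. The function name `balance_signs` is a hypothetical choice; the code follows the recipe above, writing $(3^k - 1)/2 - m$ in base 3 and setting each sign to one minus the corresponding digit.

```python
def balance_signs(m, k):
    """Signs e_0..e_{k-1} in {-1, 0, 1} with sum(e_j * 3**j) == m, as in the proof."""
    top = (3 ** k - 1) // 2
    if not 1 <= m <= top:
        raise ValueError("m must lie between 1 and (3**k - 1) / 2")
    rest = top - m
    signs = []
    for _ in range(k):
        rest, digit = divmod(rest, 3)   # next base-3 digit, least significant first
        signs.append(1 - digit)
    return signs

signs_100 = balance_signs(100, 5)
print(signs_100)   # signs for the stones 1, 3, 9, 27, 81
print(sum(e * 3 ** j for j, e in enumerate(signs_100)))
```

For $m = 100$ the result says: put the stones 1, 27 and 81 on the left pan and the stone 9 on the right pan with the object, leaving the stone 3 unused.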

Only the uniqueness part of the theorem remains to be proved. In fact, this is the
only non-trivial and interesting part. To prove this, let $A = \{a_0, \ldots, a_{k-1}\}$ have
capacity $N_k = (3^k - 1)/2$. Since, now, equality holds in the inequality $C(A) \le (3^k - 1)/2$ which
appears in the statement of the theorem, the proof of the inequality shows that every
integer $m$ in the range $-(3^k - 1)/2 \le m \le (3^k - 1)/2$ has a unique representation
(1); conversely, any $m$ of the form (1) belongs to this range. Therefore, letting $X$ be
an indeterminate, we get

$$\prod_{j=0}^{k-1} \left(X^{-a_j} + 1 + X^{a_j}\right) = \sum_{|m| \le \frac{3^k - 1}{2}} X^m, \tag{2}$$

as may be verified by multiplying out the left-hand side. Since, in particular, the largest
integer (viz., $\sum_{j=0}^{k-1} a_j$) of the form (1) must be the largest integer in the range
$\left[-\frac{3^k - 1}{2}, \frac{3^k - 1}{2}\right]$, we also have

$$\sum_{j=0}^{k-1} a_j = \frac{3^k - 1}{2}. \tag{3}$$

Using (3) and a little algebra, (2) simplifies to

$$\prod_{j=0}^{k-1} \frac{X^{3a_j} - 1}{X^{a_j} - 1} = \frac{X^{3^k} - 1}{X - 1}. \tag{4}$$

Now fix $j$, $0 \le j \le k - 1$. Let $w$ be a primitive $3a_j$-th root of unity. That is, $w$ is a
complex number such that $w^l = 1$ if and only if $l$ is an integral multiple of $3a_j$. (For
instance, we may take $w = \exp(2\pi\sqrt{-1}/3a_j)$.) Then $w$ is a zero of the left-hand side,
and hence also of the right-hand side, of (4). Thus, $w^{3^k} = 1$. So $3a_j$ divides $3^k$. That is,
$a_j \in \{3^i : 0 \le i \le k - 1\}$. Since this holds for all $j$, we have $A \subseteq \{3^i : 0 \le i \le k - 1\}$.
Since both sets have size $k$, we must have $A = \{3^i : 0 \le i \le k - 1\}$. This proves the
uniqueness of the set of given size and maximum capacity.
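For small $k$ the uniqueness statement can also be confirmed by brute force, which makes a useful sanity check on the theorem. The sketch below is an illustration only: `capacity` and `best_sets` are hypothetical names, and the bound on the stone weights is an arbitrary search limit (by (3), no stone in an optimal set can exceed $(3^k - 1)/2$).

```python
from itertools import combinations, product

def capacity(stones):
    """Largest M such that every weight 1..M is a signed sum of the stones."""
    reachable = {sum(e * a for e, a in zip(signs, stones))
                 for signs in product((-1, 0, 1), repeat=len(stones))}
    m = 0
    while m + 1 in reachable:
        m += 1
    return m

def best_sets(k, max_weight):
    """All k-element sets of weights up to max_weight with capacity (3**k - 1) / 2."""
    top = (3 ** k - 1) // 2
    return [s for s in combinations(range(1, max_weight + 1), k)
            if capacity(s) == top]

print(best_sets(2, 10))
print(best_sets(3, 20))
```

The running time grows quickly with $k$, so this check complements the generating-function proof rather than replacing it.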

The  reader  may  like  to  look  up  the  proof  in  the  book  by  Hardy  and  Wright,  which  is 
very  different  from  the  proof  given  here.  It  is  a  clever  use  of  mathematical  induction. 


TAIL-PIECE. Bachet is better remembered by mathematicians for another reason. It
was on Bachet’s edition of Diophantus’ Arithmetic that Fermat scribbled his famous
marginal notes. Bachet was also the first man to state (without proof) what is now
known as Lagrange’s four-square theorem: every natural number is the sum of at
most four perfect squares.


Suggested  Reading 

[1] F Schuh. The Master Book of Mathematical Recreations. Dover, New York,
pp 115-118, 1968.

[2] G H Hardy and E M Wright. An Introduction to the Theory of Numbers. Oxford
Univ. Press, London, pp 115-117, 1971.


B  Bagchi 

Statistics  and  Mathematics  Unit 
Indian  Statistical  Institute 
Bangalore  560  059 
Mathematical Induction

An  Impresario  of  the  Infinite 

B  Sury 


In  the  natural  sciences,  if  a  certain  phenomenon  is  observed  to  occur  a  number  of 
times,  often  a  general  law  is  formulated.  This  process  is  called  empirical  induction. 
In  general,  any  reasoning  that  draws  a  general  conclusion  based  on  verification  of 
particular  cases  is  known  as  induction.  But,  in  mathematics,  a  statement  involving 
a  natural  number  n  might  turn  out  to  be  erroneous  even  if  it  happens  to  be  true  for 
the first ten, or thousand, or even million natural numbers. For instance, the numbers
$2^{2^0} + 1 = 3$, $2^{2^1} + 1 = 5$, $2^{2^2} + 1 = 17$, $2^{2^3} + 1 = 257$, $2^{2^4} + 1 = 65537$ are all
prime numbers and the 17th century mathematician Pierre de Fermat suggested that
$2^{2^n} + 1$ must be prime for every positive integer n. However, a century later, another
great mathematician Leonhard Euler showed that $2^{2^5} + 1 = 641 \times 6700417$. An even
more convincing example is the following. If we evaluate the expression $991n^2 + 1$ for
small values of n, the resulting number is not the square of a whole number. But, for
n = 12055735790331359447442538767, the value is a perfect square. Indeed, this is
the smallest value of n for which it is a square! This tells us that, in mathematics, a lot
of  care  is  needed  to  establish  an  induction  procedure  which  proves  a  mathematical 
theorem  for  each  of  an  infinite  sequence  of  cases,  without  exception.  The  method  of 
mathematical  induction  is  such  a  procedure.  Let  us  start  with  a  simple  example. 
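
The two cautionary examples above can be checked by machine. The following is only an illustrative sketch: the helper names `is_prime` and `is_square` are ours, and scanning n below 100000 merely samples the "small values" (it does not prove that the quoted n is the smallest).

```python
from math import isqrt


def is_prime(m: int) -> bool:
    """Trial division; adequate for the small Fermat numbers tested here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def is_square(m: int) -> bool:
    """Exact perfect-square test using the integer square root."""
    r = isqrt(m)
    return r * r == m


# Fermat numbers 2^(2^k) + 1 for k = 0, ..., 5
fermat = [2 ** (2 ** k) + 1 for k in range(6)]
first_five_prime = all(is_prime(f) for f in fermat[:5])
euler_factorisation = fermat[5] == 641 * 6700417

# 991 n^2 + 1 is not a square for any n in this (small) range ...
no_small_square = not any(is_square(991 * n * n + 1) for n in range(1, 100_000))

# ... but it is a square for the value of n quoted in the text
n_big = 12055735790331359447442538767
square_at_n_big = is_square(991 * n_big * n_big + 1)
```

Python integers have arbitrary precision, so the last check is exact rather than a floating-point approximation.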

Suppose we want to prove the statement that $2^n > n$ for every natural number
n. Clearly, this inequality holds for n = 1. Now, to prove the inequality for all natural
numbers, we consider an arbitrary natural number $k \ge 1$. We assume that the
inequality $2^k > k$ holds. Then, for the next natural number k + 1, $2^{k+1} = 2 \times 2^k > 2k$
by our assumption that $2^k > k$. Now, $2k = k + k \ge k + 1$, so that the inequality
$2^{k+1} > k + 1$ follows. Thus, we have proved that if the inequality is true for any
particular k, then it is also true for k + 1.

The  crux  of  the  above  argument  rests  on  the  points: 

(0) We are given an infinite sequence of statements $P_r, P_{r+1}, \ldots$ that we would like to
prove; every statement has a ‘next’ one, and each particular statement can
be reached in a finite number of steps starting from the ‘first’ statement $P_r$.

(1) There is a general method of proving that for any $n \ge r$, if $P_n$ is true, then
$P_{n+1}$ is true; and

(2) The first statement $P_r$ is known to be true.

It  is  believed  that  these  rules  of  logic  are  as  fundamental  to  mathematics  as  the 
classical  rules  of  Aristotelian  logic. 

It is necessary to verify both steps (1) and (2) to avoid landing in absurdities.
For example, if step (2) that ‘starts induction’ is not verified, one can ‘prove’ that
all natural numbers are equal as follows. Simply denote by $P_n$ the statement
‘n = n + 1’. Then, obviously, if $P_n$ is assumed to be true, then n = n + 1 and so
n + 1 = n + 2, which means that $P_{n+1}$ is also true.

Everybody has seen instances of mathematical induction being applied. The
formulas for the sums of arithmetic and geometric progressions are usually proved by this method.

An  important  point  is  in  order  here.  Mathematical  induction  can  be  used  to  prove 
a  statement  that  is  given  to  begin  with.  As  for  coming  up  with  that  statement  itself 
(as  a  guess,  say),  it  is  altogether  a  different  matter.  Therein  lies  the  creative  element 
which  cannot  be  pinned  down  by  any  general  rules. 

As  we  observed  earlier,  mathematical  induction  is  a  procedure  that  involves  such 
extremely  ‘believable’  logic  that  we  accept  it  as  valid  reasoning.  But,  interestingly, 
we  can  actually  prove  its  validity  if  we  assume  another  believable  principle  which 
is  that  any  non-empty  set  of  positive  integers  has  a  least  number.  That  this  principle 
gives  a  proof  of  the  validity  of  mathematical  induction  is  left  as  an  exercise  to  the 
reader. 

We  now  proceed  to  give  various  instances  where  the  method  of  mathematical 
induction  appears  and  proves  fruitful. 

The  following  is  a  slight  variant  of  the  form  in  which  induction  is  used: 

To prove an infinite sequence $P_k, P_{k+1}, \ldots$ of assertions, one verifies the two steps:

(i) $P_k$ is true.

(ii) For any $n \ge k$, if we assume that all the assertions $P_k, P_{k+1}, \ldots, P_n$ hold good,
then $P_{n+1}$ also holds true.

Induction  in  Geometry 

As an example, let us show that the sum of the interior angles of a (not necessarily
convex) polygon of n sides is 180(n - 2) degrees for all $n \ge 3$. Call this statement
$P_n$. $P_3$ is true as the sum for a triangle is 180 degrees. $P_4$ is also true since any
quadrilateral can be split into two triangles.

Now, let n > 4 and assume that $P_k$ is true for k = 3, 4, ..., n - 1. Let
$A_1, A_2, \ldots, A_n$ be the vertices of a polygon with n sides. We first notice that there is
always a diagonal (i.e., a segment $A_iA_j$ that is not a side) that splits the polygon into
two with smaller numbers of sides. To see this, consider three neighbouring vertices
A, B, C. Consider all the rays emanating from B and filling the interior angle ABC.
We terminate any ray when it first meets a side or a vertex of the polygon. Either all
these rays intersect only one side (Figure 4.1) or they intersect more than one side
(Figure 4.2). In the first case, AC is a diagonal that splits the original polygon into a
triangle and a polygon with n - 1 sides. In the second case at least one ray terminates
on a vertex other than A or C. Call such a vertex D. Then, BD is a diagonal splitting
the polygon into two, with smaller numbers of sides.

Figure 4.1    Figure 4.2

Therefore, in general, let $A_1A_k$ denote a diagonal which splits the polygon
$A_1A_2 \ldots A_n$ into the polygons $A_1A_2 \ldots A_k$ and $A_kA_{k+1} \ldots A_nA_1$ of k and n - k + 2
sides respectively. By induction hypothesis, $P_k$ and $P_{n-k+2}$ are true, i.e., the sum of
the interior angles of the original polygon $A_1A_2 \ldots A_n$ is 180(k - 2) + 180(n - k) =
180(n - 2) degrees. So, $P_n$ is true, which proves by induction that $P_r$ is true for every
$r \ge 3$.
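
A numerical check of the angle-sum formula, including a non-convex case, might look as follows. This is a sketch under stated assumptions: the function name `interior_angle_sum_degrees` is ours, the polygon is assumed to be simple (non-self-intersecting), and its vertices are assumed to be listed in counter-clockwise order.

```python
import math


def interior_angle_sum_degrees(vertices):
    """Sum of the interior angles of a simple polygon, in degrees.

    Assumes `vertices` lists the corners in counter-clockwise order.
    """
    n = len(vertices)
    total = 0.0
    for i in range(n):
        ax, ay = vertices[i - 1]          # previous vertex (wraps around)
        bx, by = vertices[i]              # current vertex
        cx, cy = vertices[(i + 1) % n]    # next vertex
        ux, uy = bx - ax, by - ay         # edge arriving at the current vertex
        vx, vy = cx - bx, cy - by         # edge leaving the current vertex
        # signed turning angle: positive for a left turn, negative for a right turn
        turn = math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
        # for counter-clockwise order the interior angle is pi minus the turn,
        # which exceeds pi at a reflex (non-convex) corner
        total += math.pi - turn
    return math.degrees(total)


triangle = [(0, 0), (1, 0), (0, 1)]
# an L-shaped hexagon with one reflex corner at (1, 1)
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
```

For the L-shaped hexagon the five right angles and the single 270 degree corner add up to 720 = 180(6 - 2) degrees, in agreement with $P_6$.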

After  this  standard  example,  we  look  at  an  example  where  it  may  not  be  quite 
apparent  that  induction  can  be  used. 


The  Marriage  Problem 

The  classical  ‘marriage  problem’  can  be  stated  as  follows.  Suppose  that  each  of  a 
set  of  girls  is  acquainted  with  a  subset  from  a  given  set  of  boys.  Is  it  possible  for 
each  girl  to  marry  one  of  her  acquaintances?  Obviously,  a  necessary  condition  is  that 
every  set  of  m  girls  be  collectively  acquainted  with  at  least  m  boys.  That  this  suffices 
is  the  assertion.  Here  is  a  proof  by  induction. 

Let n denote the number of girls. If n = 1, the assertion is trivial. If n > 1 and
if it is true that every set of m girls, $1 \le m < n$, has at least m + 1 acquaintances,
then an arbitrary girl is allowed her choice and the rest are referred to the induction
hypothesis. If, on the other hand, some group of m girls, $1 \le m < n$, has precisely
m collective acquaintances, then this set of m girls is married off by induction and
it is indeed true that the remaining n - m girls satisfy the necessary condition with
respect to the remaining boys. If this were not so, then some set of s spinsters with
$1 \le s \le n - m$ would know fewer than s bachelors, and this set of s spinsters together
with the m just-married girls would have known fewer than s + m boys.
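
The statement can be explored on small examples. The sketch below is not the inductive proof given above: `hall_condition` checks the necessary condition by brute force over all groups of girls, and `find_marriages` searches for a pairing with the standard augmenting-path technique, a swapped-in method chosen because it is short. Both function names and the dictionary format (girl mapped to the set of boys she knows) are our assumptions.

```python
from itertools import combinations


def hall_condition(knows):
    """True if every set of m girls collectively knows at least m boys."""
    girls = list(knows)
    for m in range(1, len(girls) + 1):
        for group in combinations(girls, m):
            boys = set().union(*(knows[g] for g in group))
            if len(boys) < m:
                return False
    return True


def find_marriages(knows):
    """Return a dict girl -> boy marrying every girl, or None if impossible."""
    matched_girl = {}  # boy -> girl currently paired with him

    def try_to_place(girl, seen):
        for boy in knows[girl]:
            if boy in seen:
                continue
            seen.add(boy)
            # take a free boy, or ask his current partner to choose again
            if boy not in matched_girl or try_to_place(matched_girl[boy], seen):
                matched_girl[boy] = girl
                return True
        return False

    for girl in knows:
        if not try_to_place(girl, set()):
            return None
    return {girl: boy for boy, girl in matched_girl.items()}


knows_ok = {"A": {"x", "y"}, "B": {"x"}, "C": {"y", "z"}}
knows_bad = {"A": {"x"}, "B": {"x"}}
```

In `knows_bad` the two girls collectively know only one boy, so the condition fails and no pairing exists; in `knows_ok` the condition holds and a pairing is found, as the theorem predicts.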

The  reader  is  invited  to  apply  induction  to  solve  the  following  two  problems. 

EXERCISE. Consecutive Number Problem: Agatha and Beula are ‘given’ two consecutive
natural numbers n and n + 1. Both know that the numbers are consecutive
but neither knows whose number is bigger. After every minute a beep is heard and
each is asked to simultaneously say out aloud whether she knows the other’s number.
Prove by induction on the smaller number n that the person who has the number n
guesses correctly after precisely the nth beep.

EXERCISE. Macaulay Expansion: Given a natural number $d \ge 2$, let us write down
the d-tuples of non-negative integers in a strictly decreasing order. Order the tuples
lexicographically. Prove that the number of tuples appearing prior to a particular tuple
$(k_d, k_{d-1}, \ldots, k_1)$ is precisely $\binom{k_d}{d} + \binom{k_{d-1}}{d-1} + \cdots + \binom{k_1}{1}$.

This proves that any n has a unique expansion

$$n = \binom{k_d}{d} + \binom{k_{d-1}}{d-1} + \cdots + \binom{k_1}{1},$$

where $k_d > k_{d-1} > \cdots > k_1$. Here $\binom{n}{r}$ denotes the binomial coefficient which is 0
when n < r.

Induction  Incognito — Use  of  a  ‘Dummy’  Element 

Look  at  the  following  statement: 

If $a_1 < a_2 < \cdots < a_{n+1}$ are integers from the set $\{1, 2, \ldots, 2n\}$, then $a_i$ divides $a_j$
for some i < j.

This can be proved by the ‘pigeon-hole principle’ as follows. Write $a_i = 2^{k_i} l_i$ with $l_i$
odd. Then, $l_1, \ldots, l_{n+1}$, being n + 1 odd numbers between 1 and 2n, cannot all be different.
If $l_i = l_j = l$ with i < j, then, clearly $a_i = 2^{k_i} l$ divides $a_j = 2^{k_j} l$.

In  terms  of  economy  and  elegance,  this  is  unbeatable.  However,  we  find  to  our 
surprise  that  even  induction  works  and,  in  fact,  proves  the  following  more  general 
statement: 

Let $r \ge 1$, and let $A \subset \{1, 2, \ldots, 2^r n\}$ be a subset of cardinality $(2^r - 1)n + 1$. Then,
there exists a chain of r + 1 elements of A with each dividing the next.

Let us prove the original statement (corresponding to r = 1). Note that it is clearly
true for n = 1. Assume it is true for n. Consider now n + 2 numbers $a_1 < \cdots < a_{n+2}$
among 1 to 2n + 2. If $a_{n+1} \le 2n$, we are done by the induction hypothesis. In the
contrary case, we must have $a_{n+1} = 2n + 1$ and $a_{n+2} = 2n + 2$. If one of the $a_i$’s
is n + 1, we are done as it divides $a_{n+2}$. So, suppose $a_i \ne n + 1$ for any i. We may
also assume that none of the n numbers $a_1, \ldots, a_n$ divides another or else we have
nothing to prove. Now, we put in this new number n + 1 (as a ‘dummy element’) to
get n + 1 numbers between 1 and 2n. By induction hypothesis, one of these n + 1
numbers divides another. Since this has happened only after the advent of the new
number n + 1, it must be that either: (i) some $a_i$ ($i \le n$) divides n + 1, or (ii) n + 1
divides some $a_i$ ($i \le n$). But, clearly (ii) cannot happen as $n + 1 \ne a_i \le 2n$. Thus,
some $a_i$ ($i \le n$) divides n + 1 and, therefore, divides $2n + 2 = a_{n+2}$ also. Thus, we
used n + 1 as a ‘dummy element’ in this proof.
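
The statement for r = 1 is easy to test exhaustively for small n. The sketch below follows the pigeon-hole argument (grouping numbers by their odd part) rather than the dummy-element induction; the names `odd_part`, `dividing_pair` and `statement_holds` are illustrative choices of ours.

```python
from itertools import combinations


def odd_part(a: int) -> int:
    """Strip all factors of 2 from a, leaving the odd number l in a = 2^k * l."""
    while a % 2 == 0:
        a //= 2
    return a


def dividing_pair(numbers):
    """Return (a_i, a_j) with a_i < a_j sharing an odd part, or None.

    Two numbers with the same odd part differ by a power of 2,
    so the smaller one divides the larger one.
    """
    smallest_with = {}  # odd part -> smallest number seen with that odd part
    for a in sorted(numbers):
        l = odd_part(a)
        if l in smallest_with:
            return smallest_with[l], a
        smallest_with[l] = a
    return None


def statement_holds(n: int) -> bool:
    """Check every (n + 1)-element subset of {1, ..., 2n}."""
    return all(
        dividing_pair(subset) is not None
        for subset in combinations(range(1, 2 * n + 1), n + 1)
    )
```

The bound n + 1 cannot be lowered: the n numbers n + 1, ..., 2n have distinct odd parts, and indeed `dividing_pair(range(6, 11))` finds no pair.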

The  reader  is  urged  to  complete  the  proof  of  the  general  statement  along  the  same 
lines. 

Now, we come to a final example where induction appears in a different guise.


Backward Induction


If a statement is easily proved for a particular infinite subsequence of positive integers,
it might be worthwhile to try and see whether ‘backward induction’ works. By
this, we mean the following. Suppose we want to prove statements $P_n$ for all positive
integers n. Suppose, further, that it is easy to check the veracity of $P_n$ for all n in an
infinite sequence of natural numbers. Then, if we check that for any $m \ge 2$ the truth
of $P_m$ implies the truth of $P_{m-1}$, the statements $P_n$ follow for all positive integers n.

An instance is the familiar arithmetic mean-geometric mean inequality

$$P_n: \qquad \Bigl(\sum_{i \le n} a_i\Bigr)^n \ge n^n \prod_{i \le n} a_i$$

for arbitrary non-negative real numbers $a_i$, where equality holds if, and only if, all the
numbers are equal.

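On the one hand, we prove this for $n = 2^k$ by induction on k. Let k = 1. Then,
$(a_1 + a_2)^2 \ge 4a_1a_2$ with equality exactly when $a_1 = a_2$, since the difference
$(a_1 + a_2)^2 - 4a_1a_2 = (a_1 - a_2)^2$. Assume that $P_n$ is true for $n = 2^r$, $r \le k$. Let
$a_i$, $i \le 2^{k+1}$, be non-negative real numbers. Then, $\sum_{i \le 2^{k+1}} a_i = \sum_{i \le 2^k} b_i$ where
$b_i = a_{2i-1} + a_{2i}$. Therefore,

$$\Bigl(\sum_{i \le 2^{k+1}} a_i\Bigr)^{2^{k+1}} = \Bigl(\bigl(\sum_{i \le 2^k} b_i\bigr)^{2^k}\Bigr)^2 \ge \Bigl(2^{k 2^k} \prod_{i \le 2^k} b_i\Bigr)^2 = 2^{k 2^{k+1}} \prod_{i \le 2^k} b_i^2 \ge 2^{k 2^{k+1}}\, 4^{2^k} \prod_{i \le 2^{k+1}} a_i = 2^{(k+1) 2^{k+1}} \prod_{i \le 2^{k+1}} a_i,$$

which proves that $P_{2^{k+1}}$ is true. Hence, by induction, $P_{2^r}$ is valid for all $r \ge 1$.
Moreover, note that the above proof also shows that the equality
$\bigl(\sum_{i \le 2^{k+1}} a_i\bigr)^{2^{k+1}} = 2^{(k+1) 2^{k+1}} \prod_{i \le 2^{k+1}} a_i$
implies that all inequalities occurring on the way are equalities,
which again proves by induction that equality can hold in $P_{2^r}$ if, and only if, all the
$a_i$’s are equal.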

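On the other hand, for any m, the validity of $P_m$ implies the validity of $P_{m-1}$ as
follows:

Let $a_1, \ldots, a_{m-1}$ be given. Consider $a_m = \frac{1}{m-1} \sum_{i \le m-1} a_i$. Then,
$\sum_{i \le m} a_i = \frac{m}{m-1} \sum_{i \le m-1} a_i$, and $P_m$ gives

$$\Bigl(\frac{m}{m-1} \sum_{i \le m-1} a_i\Bigr)^m \ge m^m \Bigl(\prod_{i \le m-1} a_i\Bigr) \Bigl(\frac{1}{m-1} \sum_{i \le m-1} a_i\Bigr).$$

Cancelling the common factor $\frac{m^m}{m-1} \sum_{i \le m-1} a_i$ (the case where this sum is zero being trivial) leaves
$\bigl(\sum_{i \le m-1} a_i\bigr)^{m-1} \ge (m-1)^{m-1} \prod_{i \le m-1} a_i$, which is $P_{m-1}$.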

Once again, by induction, equality implies that all the numbers are equal.

To end our discussion, the reader is invited to apply induction on the positive
integer p below to prove the following result which solves an interesting two-player
game called Euclid.

Let (p, q) be a pair of positive integers satisfying p > q. Each player subtracts a
multiple of the smaller number from the bigger one without making the result negative.
The winner is the one first hitting the highest common factor of p and q. Then,
there is a winning strategy for the first player if and only if $q < \frac{1}{2}(\sqrt{5} - 1)p$.
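
Before attempting the proof, the criterion can be tested by brute force. The sketch below rests on one interpretive assumption that should be stated loudly: we read "hitting the highest common factor" in the usual way for this game, namely that play ends when a move reduces one number to zero, so that only the highest common factor remains, and the player making that move wins. The function names are ours.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def mover_wins(a: int, b: int) -> bool:
    """True if the player about to move from (a, b), a >= b >= 1, can force a win.

    Assumed rule: a move replaces a by a - k*b with 1 <= k <= a // b,
    and the player who produces 0 (leaving only the gcd) wins.
    """
    for k in range(1, a // b + 1):
        r = a - k * b
        if r == 0:
            return True                      # winning move available now
        big, small = max(b, r), min(b, r)
        if not mover_wins(big, small):
            return True                      # opponent is left in a losing position
    return False


def golden_criterion(p: int, q: int) -> bool:
    """q < (sqrt(5) - 1)/2 * p, rewritten as q^2 + p*q < p^2 to stay in integers."""
    return q * q + p * q < p * p


agreement = all(
    mover_wins(p, q) == golden_criterion(p, q)
    for p in range(2, 60)
    for q in range(1, p)
)
```

The integer form of the criterion follows because $q/p < (\sqrt{5} - 1)/2$ exactly when $(q/p)^2 + q/p - 1 < 0$; avoiding floating point keeps the comparison exact.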


On the Infinitude of the Prime Numbers

Euler’s  Proof 

Shailesh  A  Shirali 

Euclid’s  elegant  proof  that  there  are  infinitely  many  prime  numbers  is  well  known. 
Euler  proved  the  same  result,  in  fact  a  stronger  one,  by  analytical  methods.  This 
article  gives  an  exposition  of  Euler’s  proof  introducing  the  necessary  concepts 
along  the  way. 


Introduction 

In  this  article,  we  present  Euler’s  very  beautiful  proof  that  there  are  infinitely  many 
prime  numbers.  In  an  earlier  era,  Euclid  had  proved  this  result  in  a  simple  yet  elegant 
manner. His idea is easy to describe. Denoting the prime numbers by $p_1, p_2, p_3, \ldots$,
such that $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, ..., he supposes that there are n primes in all, the
largest being $p_n$. He then considers the number N where

$$N = p_1 p_2 p_3 \cdots p_n + 1,$$

and asks what the prime factors of N could be. It is clear that N is indivisible by
each of the primes $p_1, p_2, p_3, \ldots, p_n$ (indeed, $N \equiv 1 \pmod{p_i}$ for each i, $1 \le i \le n$).
Since every integer greater than 1 has a prime factorization, this forces into existence
prime numbers other than the $p_i$. Thus there can be no largest prime number, and so
the number of primes is infinite.
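
Euclid's construction is easy to try out. The sketch below (with function names of our choosing) forms N from a given list of primes and finds its smallest prime factor, which is necessarily a prime outside the list. Note that N itself need not be prime.

```python
from math import prod


def smallest_prime_factor(m: int) -> int:
    """Smallest prime dividing m (m >= 2), found by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # m itself is prime


def euclid_witness(primes):
    """Return (N, q): N is the product of `primes` plus 1, q a prime factor of N."""
    N = prod(primes) + 1
    return N, smallest_prime_factor(N)


N, q = euclid_witness([2, 3, 5, 7, 11, 13])
```

With the first six primes, N = 30031 = 59 x 509 is composite, yet both of its prime factors are new, exactly as the argument requires.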

The underlying idea of Euler’s proof is very different from that of Euclid’s proof.
In essence, he proves that the sum of the reciprocals of the primes is infinite; that is,

$$\frac{1}{p_1} + \frac{1}{p_2} + \frac{1}{p_3} + \cdots = \infty.$$

In technical language, the series $\sum_i 1/p_i$ diverges. Obviously, this cannot possibly
happen if there are only finitely many prime numbers. The infinitude of the primes
thus follows as a corollary. Note that Euler’s result is stronger than Euclid’s.


Convergence and Divergence

A few words are necessary to explain the concepts of convergence and divergence
of infinite series. A series $a_1 + a_2 + a_3 + \cdots$ is said to converge if the sequence of
partial sums,

$$a_1,\; a_1 + a_2,\; a_1 + a_2 + a_3,\; \ldots$$

approaches some limiting value, say L; we write, in this case, $\sum a_i = L$. If,
instead, the sequence of partial sums grows without any bound, we say that the series
diverges, and we write, in short$^1$, $\sum a_i = \infty$.

Examples. 

• The series $1/1 + 1/2 + 1/4 + \cdots + 1/2^n + \cdots$ converges (the sum is 2, as is
easily shown).

• The series $1/1 + 1/3 + 1/9 + \cdots + 1/3^n + \cdots$ converges (the sum in this case
is 3/2).

• The series $1 + 1 + 1 + \cdots$ diverges (rather trivially).

• The series $1 - 1 + 1 - 1 + 1 - 1 + 1 - \cdots$ also fails to converge, because the
partial sums assume the values 1, 0, 1, 0, 1, 0, ... and this sequence clearly
does not possess a limit.

• A more interesting example: $1 - 1/2 + 1/3 - 1/4 + \cdots$; a careful analysis
shows that it too is convergent, the limiting sum being ln 2 (the natural logarithm
of 2).


Divergence of the Harmonic Series $\sum 1/i$

In order to prove Euler’s result, namely, the divergence of $\sum 1/p_i$, we need to establish
various subsidiary results. Along the way, we shall meet other examples of divergent
series. To start with, we present the proof of the statement that

$$\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = \infty.$$

This rather non-obvious result is usually referred to as the divergence of the harmonic
series. The proof given below is due to the Frenchman Nicole Oresme and it
dates to about 1350. We note the following sequence of equalities and inequalities:

$$\frac{1}{1} = \frac{1}{1},$$
$$\frac{1}{2} = \frac{1}{2},$$
$$\frac{1}{3} + \frac{1}{4} > \frac{1}{4} + \frac{1}{4} = \frac{1}{2},$$
$$\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{1}{2},$$
$$\frac{1}{9} + \frac{1}{10} + \cdots + \frac{1}{16} > \frac{1}{16} + \frac{1}{16} + \cdots + \frac{1}{16} = \frac{1}{2},$$

and so on. We see that it is possible to group consecutive sets of terms of the series
$1/1 + 1/2 + 1/3 + \cdots$ in such a manner that each group has a sum of at least 1/2.
Since the number of such groups is infinite, it follows that the sum of the whole series
is itself infinite. (Note the crisp and decisive nature of the proof!)

1 A statement of the form $\sum a_i = \infty$ is to be regarded as merely a short form for the statement that the
sums $a_1, a_1 + a_2, a_1 + a_2 + a_3, \ldots$, do not possess any limit. It is important to note that $\infty$ is not to be
regarded as a number! We shall however frequently use phrases of the type ‘$x = \infty$’ (for various quantities
x) during the course of this article. The meaning should be clear from the context.

Based on this proof, we make a more precise statement. Let S(n) denote the sum

$$S(n) = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n},$$

e.g., S(3) = 11/6. Generalizing from the reasoning just used, we find that

$$S(2^n) \ge 1 + \frac{n}{2}. \qquad (1)$$

(Please fill in the details of the proof on your own.) This means that by choosing n
to be large enough, the value of $S(2^n)$ can be made to exceed any given bound. For
instance, if we wanted the sum to exceed 100, then (1) assures us that a mere $2^{198}$
terms would suffice! This suggests the extreme slowness of growth of S(n) with n.
Nevertheless it does grow without bound; loosely stated, $S(\infty) = \infty$.
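
Inequality (1) can be checked exactly for small n with rational arithmetic. This is only a sanity check, not a proof; the name `S` mirrors the text, while `checks` and `n_for_100` are illustrative names of ours.

```python
from fractions import Fraction


def S(n: int) -> Fraction:
    """Exact value of the partial sum 1 + 1/2 + ... + 1/n."""
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))


# (n, S(2^n), 1 + n/2) for n = 0, ..., 10
checks = [(n, S(2 ** n), 1 + Fraction(n, 2)) for n in range(11)]

# smallest n for which the lower bound 1 + n/2 in (1) reaches 100
n_for_100 = next(n for n in range(1000) if 1 + Fraction(n, 2) >= 100)
```

Equality in (1) occurs only for n = 0 and n = 1; from n = 2 onwards the inequality is strict, which is why $2^{198}$ terms already push the sum past 100.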

The result obtained above, (1), can also be written in the form,


$$S(n) \ge 1 + \frac{1}{2} \log_2 n.$$


EXERCISE.  Write  out  a  proof  of  the  above  inequality. 

A much more accurate statement can be made, but it involves calculus. We consider
the curve Q whose equation is y = 1/x, x > 0. The area of the region enclosed by
Q, the x-axis and the ordinates x = 1 and x = n is equal to $\int_1^n \frac{1}{x}\,dx$, which simplifies
to ln n. Now let the region be divided into (n - 1) strips of unit width by the lines
x = 1, x = 2, x = 3, ..., x = n (see Figure 5.1).

Consider the region enclosed by Q, the x-axis, and the lines x = i - 1, x = i.
The area of this region lies between 1/i and 1/(i - 1), because it can be enclosed
between two rectangles of dimensions 1 x 1/i and 1 x 1/(i - 1), respectively. (A quick
examination of the graph will show why this is true.) By letting i take the values
2, 3, 4, ..., n, and adding the inequalities thus obtained, we find that

$$\frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < \ln n < \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n-1}. \qquad (2)$$

Figure 5.1 The figure shows how to bound ln n by observing that ln n is the area
enclosed by the curve y = 1/x, the x-axis and the ordinates x = 1 and x = n.

Relation (2) implies that

$$\ln n + \frac{1}{n} < \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < \ln n + 1, \qquad (3)$$

and this means that we have an estimate for S(n) (namely, ln n + 0.5) that differs
from the actual value by no more than 0.5. A still deeper analysis shows that for
large values of n, an excellent approximation for S(n) is ln n + 0.577, but we shall
not prove this result here. It is instructive, however, to check the accuracy of this
estimate. Write f(n) for ln n + 0.577. We now find the following:


n        10         100        1000       10000      100000
S(n)     2.92897    5.18738    7.48547    9.78761    12.0902
f(n)     2.87959    5.18217    7.48476    9.78734    12.0899

The closeness of the values of f(n) and S(n) for large values of n is striking. (The
constant 0.577 is related to what is known as the Euler-Mascheroni constant.)
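
The comparison of S(n) with f(n) is straightforward to recompute. The following sketch uses floating-point arithmetic; the names `S` and `f` mirror the text, and `math.fsum` is used only to limit rounding error in the long sums.

```python
import math


def S(n: int) -> float:
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return math.fsum(1.0 / i for i in range(1, n + 1))


def f(n: int) -> float:
    """The approximation ln n + 0.577 discussed in the text."""
    return math.log(n) + 0.577


# one row per value of n: n, S(n), f(n) and the gap S(n) - f(n)
rows = [(n, S(n), f(n), S(n) - f(n)) for n in (10, 100, 1000, 10_000, 100_000)]
for n, s, approx, gap in rows:
    print(f"{n:>7}  {s:.5f}  {approx:.5f}  {gap:.5f}")
```

The gap shrinks steadily as n grows, in line with the pattern visible in the figures above.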

In general, when mathematicians find that a series $\sum a_i$ diverges, they are also
curious to know how fast it diverges. That is, they wish to find a function, say f(n),
such that the ratio $\bigl(\sum_{i=1}^{n} a_i\bigr)/f(n)$ tends to 1 as $n \to \infty$. For the harmonic series $\sum 1/i$,
we see that one such function is given by f(n) = ln n. This is usually expressed by
saying that the harmonic series diverges like the logarithmic function. We note in
passing that this is a very slow rate of divergence, because ln n diverges more slowly
than $n^{\epsilon}$ for any $\epsilon > 0$, no matter how small $\epsilon$ is, in the sense that $\ln n / n^{\epsilon} \to 0$ as
$n \to \infty$ for every $\epsilon > 0$. Obviously the function ln ln n diverges still more slowly.

EXERCISE. Prove that if a > 1, then the series

$$\frac{1}{1^a} + \frac{1}{2^a} + \frac{1}{3^a} + \cdots$$

converges. (The conclusion holds no matter how close a is to 1, but it does not hold
for a = 1 or a < 1, a curious state of affairs!) Further, use the methods of integral
calculus (and the fact that for $a \ne 1$, the integral of $1/x^a$ is $x^{1-a}/(1 - a)$) to show
that the sum of the series lies between 1/(a - 1) and a/(a - 1).

The fact that the sum $1/1 + 1/2^2 + 1/3^2 + \cdots$ is finite can be shown in another
manner that is both elegant and elementary. We start with the inequalities, $2^2 >
1 \times 2$, $3^2 > 2 \times 3$, $4^2 > 3 \times 4$, ..., and deduce from these that

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots < 1 + \frac{1}{1 \times 2} + \frac{1}{2 \times 3} + \frac{1}{3 \times 4} + \cdots.$$

The sum on the right side can be written in the form,

$$1 + \Bigl(\frac{1}{1} - \frac{1}{2}\Bigr) + \Bigl(\frac{1}{2} - \frac{1}{3}\Bigr) + \Bigl(\frac{1}{3} - \frac{1}{4}\Bigr) + \cdots, \qquad (4)$$

which (after a whole feast of cancellations) simplifies to 1 + 1/1, that is, to 2. (This is
sometimes described by stating that the series ‘telescopes’ to 2.) Therefore the sum
$1 + 1/2^2 + 1/3^2 + 1/4^2 + \cdots$ is less than 2. We now call upon a theorem of analysis
which states that if the partial sums of any series form an increasing sequence and
are at the same time bounded, that is, they do not exceed some fixed number, then
they possess a limit. We conclude, therefore, that the series $\sum 1/i^2$ does possess a
finite sum which lies between 1 and 2.

The divergence of the harmonic series was independently proved by Johann Bernoulli
in 1689 in a completely different manner. His proof is worthy of deep study, as
it shows the counter-intuitive nature of infinity.

Bernoulli starts by assuming that the series $1/2 + 1/3 + 1/4 + \cdots$ (note that he
starts with 1/2 rather than 1/1) does have a finite sum, which he calls S. He now proceeds
to derive a contradiction in the following manner. He rewrites each term occurring in
S thus:

$$\frac{1}{3} = \frac{2}{6} = \frac{1}{6} + \frac{1}{6}, \qquad \frac{1}{4} = \frac{3}{12} = \frac{1}{12} + \frac{1}{12} + \frac{1}{12},$$

and more generally,

$$\frac{1}{n} = \frac{n-1}{n(n-1)} = \frac{1}{n(n-1)} + \frac{1}{n(n-1)} + \cdots + \frac{1}{n(n-1)},$$

with (n - 1) fractions on the right side. Next he writes the resulting fractions in an
array as shown below:


1/2   1/6   1/12  1/20  1/30  1/42  1/56  ...
      1/6   1/12  1/20  1/30  1/42  1/56  ...
            1/12  1/20  1/30  1/42  1/56  ...
                  1/20  1/30  1/42  1/56  ...
                        1/30  1/42  1/56  ...
                              1/42  1/56  ...
                                    1/56  ...

Note that the column sums are just the fractions 1/2, 1/3, 1/4, 1/5, ...; thus, S is the
sum of all the fractions occurring in the array. Bernoulli now sums the rows using
the telescoping technique used above (see equation (4)). Assigning symbols to the
row sums as shown below,

$$A = \frac{1}{2} + \frac{1}{6} + \frac{1}{12} + \frac{1}{20} + \frac{1}{30} + \frac{1}{42} + \frac{1}{56} + \cdots,$$
$$B = \frac{1}{6} + \frac{1}{12} + \frac{1}{20} + \frac{1}{30} + \frac{1}{42} + \frac{1}{56} + \cdots,$$
$$C = \frac{1}{12} + \frac{1}{20} + \frac{1}{30} + \frac{1}{42} + \frac{1}{56} + \cdots,$$
$$D = \frac{1}{20} + \frac{1}{30} + \frac{1}{42} + \frac{1}{56} + \cdots,$$

he finds that:

$$A = \Bigl(1 - \frac{1}{2}\Bigr) + \Bigl(\frac{1}{2} - \frac{1}{3}\Bigr) + \Bigl(\frac{1}{3} - \frac{1}{4}\Bigr) + \cdots = 1,$$
$$B = A - \frac{1}{2} = \frac{1}{2},$$
$$C = B - \frac{1}{6} = \frac{1}{3} \quad \text{(arguing likewise)},$$
$$D = C - \frac{1}{12} = \frac{1}{4},$$

and so on. Thus the sum S, which we had written in the form $A + B + C + D + \cdots$,
turns out to be equal to

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots.$$

Now this looks disappointing, just as things were beginning to look promising!
We seem to have just recovered the original series after a series of very complicated
steps. But in fact something significant has happened: an extra ‘1’ has entered the
series. At the start we had defined S to be $1/2 + 1/3 + 1/4 + \cdots$; now we find that
S equals $1 + 1/2 + 1/3 + 1/4 + \cdots$. This means that S = S + 1. However, no finite
number can satisfy such an equation. Conclusion: $S = \infty$!
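
The bookkeeping in Bernoulli's array can be verified with exact fractions. In the sketch below (names ours), the k-th row is taken to consist of the terms 1/(n(n + 1)) for n = k, k + 1, ..., so that row 1 is A, row 2 is B, and so on; only finitely many terms of each row are summed.

```python
from fractions import Fraction


def row_partial_sum(k: int, N: int) -> Fraction:
    """Sum of 1/(n(n+1)) for n = k, ..., N: the first terms of the k-th row."""
    return sum((Fraction(1, n * (n + 1)) for n in range(k, N + 1)), Fraction(0))


def column_sum(j: int) -> Fraction:
    """The j-th column holds the entry 1/(j(j+1)) exactly j times."""
    return j * Fraction(1, j * (j + 1))


# each row telescopes: the partial sum equals 1/k - 1/(N + 1), which tends to 1/k
telescoped = all(
    row_partial_sum(k, 500) == Fraction(1, k) - Fraction(1, 501)
    for k in range(1, 6)
)
```

As N grows the correction 1/(N + 1) vanishes, giving the row sums 1, 1/2, 1/3, 1/4, ..., while the columns give 1/2, 1/3, 1/4, ...: the same array summed two ways.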

There  are  many  other  proofs  of  this  beautiful  result,  but  I  shall  leave  you  with  the 
pleasant  task  of  coming  up  with  them  on  your  own.  Along  the  way  you  could  set 
yourself the task of proving that each of the following sums diverges:

• $1/1 + 1/3 + 1/5 + 1/7 + 1/9 + \cdots$;

• $1/1 + 1/11 + 1/21 + 1/31 + 1/41 + \cdots$;

• $1/a + 1/b + 1/c + 1/d + \cdots$, where a, b, c, d, ..., are the successive terms of
any increasing arithmetic progression of positive real numbers.


Elementary Results


The next result that we shall need is the so-called fundamental theorem of arithmetic:
every positive integer greater than 1 can be expressed in precisely one way (apart from
the order of the factors) as a product of prime numbers. We shall not prove this very
basic theorem of number theory. For a proof, please refer to any of the well-known texts
on number theory, e.g., the text by Hardy and Wright, or the one by Niven and Zuckerman.

We shall also need the following rather elementary results: (i) if k is any integer
greater than 1, then

$$\frac{1}{1 - 1/k} = 1 + \frac{1}{k} + \frac{1}{k^2} + \frac{1}{k^3} + \frac{1}{k^4} + \cdots, \qquad (5)$$

which follows by summing the geometric series on the right side, and (ii) if $a_i$, $b_j$
are any quantities, then

$$\Bigl(\sum_i a_i\Bigr) \Bigl(\sum_j b_j\Bigr) = \sum_{i,j} a_i b_j,$$

where, in the sum on the right, each pair of indices (i, j) occurs precisely once.

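Now consider the following two equalities, which are obtained from (5) using the
values k = 2, k = 3:

$$\frac{1}{1 - 1/2} = 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \frac{1}{2^4} + \cdots,$$

$$\frac{1}{1 - 1/3} = 1 + \frac{1}{3} + \frac{1}{3^2} + \frac{1}{3^3} + \frac{1}{3^4} + \cdots.$$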

We multiply together the corresponding sides of these two equations. On the left side
we obtain 2 x 3/2 = 3. On the right side we obtain the product

$$(1 + 1/2 + 1/2^2 + 1/2^3 + \cdots) \times (1 + 1/3 + 1/3^2 + 1/3^3 + \cdots).$$

Expanding the product, we obtain:

$$1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{3} + \frac{1}{9} + \frac{1}{27} + \cdots + \frac{1}{6} + \frac{1}{12} + \frac{1}{24} + \cdots + \frac{1}{18} + \frac{1}{36} + \frac{1}{72} + \cdots;$$

that is, we obtain the sum of the reciprocals of all the positive integers that have only
2 and 3 among their prime factors. The fundamental theorem of arithmetic assures
us that each such integer occurs precisely once in the sum on the right side. Thus we
obtain a nice corollary: If A denotes the set of integers of the form $2^a 3^b$, where a
and b are non-negative integers, then

$$\sum_{z \in A} \frac{1}{z} = 3.$$

If we multiply the left side of this relation by $(1 + 1/5 + 1/5^2 + 1/5^3 + \cdots)$ and
the right side by 1/(1 - 1/5), we obtain the following result:

$$\sum_{z \in B} \frac{1}{z} = 3 \times \frac{1}{1 - 1/5} = \frac{15}{4},$$

where B denotes the set of integers of the form $2^a 3^b 5^c$, where a, b and c denote
non-negative integers.

Continuing this line of argument, we see that infinitely many such statements can
be made, for example:

• If C denotes the set of positive integers of the form $2^a 3^b 5^c 7^d$, where a, b, c
and d are non-negative integers, we then have $\sum_{z \in C} 1/z = (15/4)(7/6) = 35/8$.

• If D denotes the set of positive integers of the form $2^a 3^b 5^c 7^d 11^e$, then
$\sum_{z \in D} 1/z = (35/8)(11/10) = 77/16$.
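
These product values, and the claim that they equal sums of reciprocals, can be checked directly. The sketch below (function names ours) computes the finite products exactly, and compares the first of them with a truncated reciprocal sum over integers of the form $2^a 3^b$; the truncation bound is an arbitrary choice.

```python
from fractions import Fraction


def euler_product(primes) -> Fraction:
    """Exact value of the product of 1/(1 - 1/p) over the given primes."""
    result = Fraction(1)
    for p in primes:
        result *= 1 / (1 - Fraction(1, p))
    return result


def smooth_reciprocal_sum(primes, bound: int) -> Fraction:
    """Sum of 1/z over integers z <= bound whose prime factors all lie in `primes`."""
    members = [1]
    for p in primes:
        extended = []
        for z in members:
            # append z, z*p, z*p^2, ... while staying within the bound
            while z <= bound:
                extended.append(z)
                z *= p
        members = extended
    return sum((Fraction(1, z) for z in members), Fraction(0))


partial = smooth_reciprocal_sum([2, 3], 10 ** 6)
```

The truncated sum `partial` stays below 3 but approaches it closely, as the corollary for the set A leads one to expect.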

Infinitude  of  the  Primes 


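Suppose now that there are only finitely many primes, say $p_1, p_2, p_3, \ldots, p_n$, where
$p_1 = 2$, $p_2 = 3$, $p_3 = 5$, .... We consider the product

$$\frac{1}{1 - 1/2} \cdot \frac{1}{1 - 1/3} \cdot \frac{1}{1 - 1/5} \cdots \frac{1}{1 - 1/p_n}.$$

This is obviously a finite number, being the product of finitely many non-zero fractions.
Now this product also equals

$$\Bigl(1 + \frac{1}{2} + \frac{1}{2^2} + \cdots\Bigr) \Bigl(1 + \frac{1}{3} + \frac{1}{3^2} + \cdots\Bigr) \Bigl(1 + \frac{1}{5} + \frac{1}{5^2} + \cdots\Bigr) \times \cdots \times \Bigl(1 + \frac{1}{p_n} + \frac{1}{p_n^2} + \cdots\Bigr).$$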

When  we  expand  this  product,  we  find,  by  continuing  the  line  of  argument  developed 
above,  that  we  obtain  the  sum  of  the  reciprocals  of  all  the  positive  integers.  To  see 
why,  we  need  to  use  the  fundamental  theorem  of  arithmetic  and  the  assumption  that 
$2, 3, 5, \ldots, p_n$ are all the primes that exist; these two statements together imply that every positive integer can be expressed uniquely as a product of non-negative powers of the $n$ primes $2, 3, 5, \ldots, p_n$. From this it follows that the expression on the right side is precisely the sum

$$\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots,$$

written in some permuted order. But by the Oresme-Bernoulli theorem, the latter sum is infinite! So we have a contradiction: the finite number

$$\frac{1}{1 - 1/2} \cdot \frac{1}{1 - 1/3} \cdot \frac{1}{1 - 1/5} \cdots \frac{1}{1 - 1/p_n}$$

has  been  shown  to  be  infinite — an  absurdity!  The  only  way  out  of  this  contradiction 
is to drop the assumption that there are only finitely many prime numbers. Thus we reach the desired objective, namely, that of proving that there are infinitely many prime numbers.

Note  that,  as  a  bonus,  there  are  several  formulas  that  drop  out  of  this  analysis, 
more  or  less  as  corollaries.  For  instance,  we  find  that 

$$\frac{1}{1 - 1/2^2} \cdot \frac{1}{1 - 1/3^2} \cdot \frac{1}{1 - 1/5^2} \cdots = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots,$$

that is, the infinite product and the infinite sum both converge to the same (finite) value. By a stunning piece of reasoning, including a few daring leaps that would leave today's mathematicians gasping for breath, Euler showed that both sides of the above equation are equal to $\pi^2/6$. Likewise, we find that

$$\frac{1}{1 - 1/2^4} \cdot \frac{1}{1 - 1/3^4} \cdot \frac{1}{1 - 1/5^4} \cdots = 1 + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \cdots,$$

and this time both sides converge to $\pi^4/90$. Euler proved all this and much, much more; it is not for nothing that he is at times referred to as analysis incarnate!


The Divergence of $\sum 1/p$


As  mentioned  earlier,  Euler  showed  in  addition  that  the  sum 


$$\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \cdots$$


is itself infinite. We are now in a position to obtain this beautiful result. For any positive integer $n \ge 2$, let $P_n$ denote the set of prime numbers less than or equal to $n$. We start by showing that

$$\prod_{p \in P_n} \frac{1}{1 - 1/p} > \sum_{j=1}^{n} \frac{1}{j}. \tag{6}$$


Our strategy will be a familiar one. We write down the following inequality for each $p \in P_n$, which follows from (5):

$$\frac{1}{1 - 1/p} > 1 + \frac{1}{p} + \frac{1}{p^2} + \frac{1}{p^3} + \cdots + \frac{1}{p^n}.$$

The '>' sign holds because we have left out all the positive terms that follow the term $1/p^n$. Multiplying together the corresponding sides of all these inequalities ($p \in P_n$), we obtain:

$$\prod_{p \in P_n} \frac{1}{1 - 1/p} > \prod_{p \in P_n} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \frac{1}{p^3} + \cdots + \frac{1}{p^n}\right).$$


When we expand the product on the right side, we obtain a sum of the form $\sum_{j \in A} 1/j$ for some set of positive integers $A$. This set certainly includes all the integers from 1 to $n$, because the set $P_n$ contains all the prime numbers between 1 and $n$ and no prime occurs in such an integer with an exponent exceeding $n$. Inequality (6) thus follows immediately.

Next, we already know (see equation (3)) that

$$\sum_{j=1}^{n} \frac{1}{j} > \ln n + \frac{1}{n} > \ln n. \tag{7}$$

Combining (6) and (7), we obtain the following inequality:

$$\prod_{p \in P_n} \frac{1}{1 - 1/p} > \ln n.$$


Taking logarithms on both sides, this translates into the statement

$$\sum_{p \in P_n} \ln\left(\frac{1}{1 - 1/p}\right) > \ln \ln n. \tag{8}$$

Our task is nearly over. It only remains to relate the sum $\sum_{p \in P_n} 1/p$ to the sum on the left side of (8). We accomplish this by showing that the inequality

$$\frac{7x}{5} > \ln\left(\frac{1}{1 - x}\right) \tag{9}$$

holds for $0 < x \le 1/2$.

To see why (9) is true, draw the graph of the curve $\Gamma$ whose equation is $y = \ln(1/(1 - x))$, over the domain $-\infty < x < 1$ (see Figure 5.2). Note that $\Gamma$ passes through the origin and is convex over its entire extent. (PROOF: Write $f(x) = -\ln(1 - x)$; then $f'(x) = 1/(1 - x)$ and $f''(x) = 1/(1 - x)^2 > 0$ for all $x < 1$.)


Figure 5.2 The graph shows that for $0 \le x \le 1/2$, we have $(2 \ln 2)\,x \ge \ln(1/(1 - x))$.

The convexity of $\Gamma$ implies that the chord joining the points $A(0, 0)$ and $B(1/2, \ln 2)$ lies completely above the curve. The equation of $AB$ is $y = (2 \ln 2)\,x$, so that over the range $0 \le x \le 1/2$ we have the inequality:

$$(2 \ln 2)\,x \ge \ln\left(\frac{1}{1 - x}\right).$$

Since $\ln 2 \approx 0.69315 < 0.7 = 7/10$, we have $2 \ln 2 < 7/5$, and (9) follows.
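Inequality (9) implies that

$$x > \frac{5}{7} \ln\left(\frac{1}{1 - x}\right)$$

for $x = 1/2$, $x = 1/3$, $x = 1/5$, .... Therefore, by addition,

$$\sum_{p \in P_n} \frac{1}{p} > \frac{5}{7} \sum_{p \in P_n} \ln\left(\frac{1}{1 - 1/p}\right). \tag{10}$$

Combining (8) and (10), we deduce that

$$\sum_{p \in P_n} \frac{1}{p} > \frac{5}{7} \ln \ln n.$$

As $n \to \infty$, the right side diverges to infinity, therefore so does the left side; so we reach our desired objective, that of showing the divergence of $\sum 1/p_i$.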
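The final inequality, that the sum of $1/p$ over primes $p \le n$ exceeds $(5/7) \ln \ln n$, can be checked numerically for moderate $n$. The sketch below is illustrative only; the helper names `primes_up_to` and `prime_reciprocal_sum` are our own, and floating-point sums are used, so it is a sanity check rather than a proof.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n (for n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting from i*i.
            sieve[i * i :: i] = [False] * len(range(i * i, n + 1, i))
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def prime_reciprocal_sum(n):
    """Floating-point value of the sum of 1/p over primes p <= n."""
    return sum(1.0 / p for p in primes_up_to(n))


for n in (10, 100, 10_000, 1_000_000):
    lhs = prime_reciprocal_sum(n)
    rhs = (5 / 7) * math.log(math.log(n))
    print(n, round(lhs, 4), round(rhs, 4), lhs > rhs)
```

The printed values also give a feeling for how slowly both sides grow as $n$ increases.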


An  Alternative  Proof 

Here is an alternative proof of the claim that $\sum 1/p_i$ diverges. The proof has been written in an 'old-fashioned' style and purists will protest. Nevertheless, we shall present the proof and let the readers judge for themselves. Let $S$ denote the sum $\sum_i 1/p_i$. We shall make use of the following result:

$$e^x \ge 1 + x \quad \text{for all real values of } x,$$

with equality holding precisely when $x = 0$. The graphs of $e^x$ and $1 + x$ show why this is true; the former graph is convex over its entire extent (examine the second derivative of $e^x$ to see why), while the latter, a line, is tangent to the former at the point $(0, 1)$, and lies entirely below it everywhere else. Substituting the values $x = 1/2$, $x = 1/3$, $x = 1/5$, ..., successively into this inequality, we find that

$$e^{1/2} > 1 + \frac{1}{2}, \quad e^{1/3} > 1 + \frac{1}{3}, \quad e^{1/5} > 1 + \frac{1}{5}, \quad \ldots.$$

Multiplying together the corresponding sides of these inequalities, we obtain:

$$e^S > \left(1 + \frac{1}{2}\right) \left(1 + \frac{1}{3}\right) \left(1 + \frac{1}{5}\right) \left(1 + \frac{1}{7}\right) \cdots.$$

The infinite product on the right side yields the following series:

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{10} + \frac{1}{11} + \frac{1}{13} + \frac{1}{14} + \frac{1}{15} + \cdots.$$


This series is the sum of the reciprocals of all the positive integers whose prime factors are all distinct; equivalently, the positive integers that have no squared factors. These numbers are sometimes referred to as the quadratfrei or square-free numbers. Let $Q$ denote this sum. We shall show that this series itself diverges, in other words, that $Q = \infty$. This will immediately imply that $S = \infty$ (for $e^S > Q$), and Euler's result will then follow.

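We consider the product

$$Q \times \left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots\right).$$

This product, when expanded, gives the following series:

$$\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots,$$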

that is, we obtain the harmonic series. To see why, note that every positive integer $n$ can be uniquely written as a product of a square-free number and a square; for example, $1000 = 10 \times 10^2$, $2000 = 5 \times 20^2$, $1728 = 3 \times 24^2$, and so on. Now when we multiply

$$\left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{10} + \frac{1}{11} + \frac{1}{13} + \frac{1}{14} + \frac{1}{15} + \cdots\right)$$

with

$$\left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots\right)$$

we find, by virtue of the remark just made, that the reciprocal of each positive integer $n$ occurs precisely once in the expanded product. This explains why the product is just the harmonic series. Now recall that the sum

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots$$


is finite (indeed, we have shown that it is less than 2). It follows that

$$Q \times (\text{some finite number}) = \infty.$$

Therefore $Q = \infty$, and Euler's result ($\sum 1/p_i = \infty$) follows. QED!
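The key combinatorial fact in this proof is that every positive integer factors uniquely as a square-free number times a perfect square. The following sketch, with hypothetical helper names of our own choosing, computes that decomposition by trial division and checks on a small range that the products $q m^2$ hit each integer exactly once.

```python
def squarefree_times_square(n):
    """Return (q, m) with n == q * m * m and q square-free."""
    q, m = n, 1
    d = 2
    while d * d <= q:
        # Strip every factor d*d out of q and record d in the square part.
        while q % (d * d) == 0:
            q //= d * d
            m *= d
        d += 1
    return q, m


def is_squarefree(q):
    return squarefree_times_square(q)[1] == 1


# The three examples quoted in the text.
for n in (1000, 2000, 1728):
    print(n, squarefree_times_square(n))

# Every integer 1..N arises exactly once as q * m^2 with q square-free.
N = 200
products = sorted(
    q * m * m
    for q in range(1, N + 1) if is_squarefree(q)
    for m in range(1, N + 1) if q * m * m <= N
)
print(products == list(range(1, N + 1)))
```

This finite check mirrors, term by term, the expansion of the product of the two series into the harmonic series.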

Readers who are unhappy with this style of presentation, in which $\infty$ is treated as an ordinary real number, will find it an interesting (but routine) exercise to rewrite the proof to accord with more exacting standards of rigour and precision.


[Figure: facsimile of the original Latin text of the proof that the harmonic series diverges.]

Johann’s  divergence  proof,  from  Jakob’s  Tractatus  de  Seriebus  Infinitis,  republished  in  1713. 
(From  page  197  of  Journey  through  Genius  by  William  Dunham.) 


Conclusion 

A much deeper (but also more difficult) analysis shows that the sum $1/p_1 + 1/p_2 + 1/p_3 + \cdots + 1/p_n$ is approximately equal to $\ln \ln n$. This is usually stated in the following form: as $n$ tends to $\infty$, the fraction

$$\frac{1/p_1 + 1/p_2 + 1/p_3 + \cdots + 1/p_n}{\ln \ln n}$$

tends to 1. This is indeed a striking result, reminiscent of the earlier result that $1/1 + 1/2 + 1/3 + \cdots + 1/n$ is approximately equal to $\ln n$. It shows the staggeringly slow rate of divergence of the sum of the reciprocals of the primes. The harmonic series $\sum 1/i$ diverges slowly enough: to achieve a sum of over 100, for instance, we would need to add more than $10^{43}$ terms, so it is certainly not a job that one can leave to finish off over a weekend. (Do you see where the number $10^{43}$ comes from?) On the other hand, to achieve a sum of over 100 with the series $\sum 1/p_i$, we need to add something like $10^{10^{43}}$ terms!! This number is so stupendously large that it is a hopeless task to make any visual image of it. Certainly there is no magnitude even remotely comparable to it in the whole of the known universe.


On Fermat's Two Squares Theorem


Shailesh  A  Shirali 

Introduction 

The  purpose  of  this  chapter  is  to  present  a  proof  of  the  two  squares  theorem:  every 
prime  of  the  form  1  (mod  4)  can  be  written  as  a  sum  of  two  squares.  The  theorem  was 
first  stated  by  Fermat  (as  usual,  with  no  proof!)  and  later  proved  by  Euler.  The  proof 
given  here  is  an  elaboration  of  the  one  presented  by  Don  Zagier  in  a  crisp  note  that 
appeared  in  The  American  Mathematical  Monthly,  Vol.  97,  #  2  (Feb  1990).  As  Zagier 
himself  remarks  in  his  paper,  his  proof  is  not  constructive.  In  the  final  section  we 
make  an  interesting  conjecture  which,  if  correct,  will  provide  a  constructive  version 
of  Zagier’s  proof. 

Throughout, $p$ refers to a fixed prime of the form 1 (mod 4), while $\mathbb{N}$ refers to the set of positive integers. For a finite set $X$, $|X|$ denotes the cardinality of $X$.


Proof  of  the  Two  Squares  Theorem 

The proof hinges on a study of the solutions in positive integers of the equation $x^2 + 4yz = p$. Let $S_p$ denote the solution set:

$$S_p = \{(x, y, z) \in \mathbb{N}^3 : x^2 + 4yz = p\}. \tag{1}$$

It is easy to verify that $S_p$ is non-empty (for $(1, 1, \frac{p-1}{4}) \in S_p$) and finite. We shall show that $|S_p|$ is odd.

Consider  the  following  relations: 

$$x^2 + 4yz = (x + 2z)^2 + 4z(y - x - z) = (2y - x)^2 + 4y(x - y + z). \tag{2}$$

From this we see that $\alpha$, $\beta$, $\gamma$ as defined by

$$\alpha(x, y, z) = (x + 2z,\ z,\ y - x - z), \tag{3}$$

$$\beta(x, y, z) = (2y - x,\ y,\ x - y + z), \tag{4}$$

$$\gamma(x, y, z) = (x - 2y,\ x - y + z,\ y), \tag{5}$$

are maps of the solution set in real numbers of $x^2 + 4yz = p$ into itself; still better, they are unimodular maps: they permute the integer solutions amongst themselves. (This can be checked by observing that the matrices corresponding to the three maps are all unimodular, that is, they have determinant ±1.) Since our interest lies chiefly in the positive integral solutions, we define subsets $A_p$, $B_p$ and $C_p$ of $S_p$ as follows:


$$A_p = \{(x, y, z) \in S_p : x < y - z\}, \tag{6}$$

$$B_p = \{(x, y, z) \in S_p : y - z < x < 2y\}, \tag{7}$$

$$C_p = \{(x, y, z) \in S_p : 2y < x\}. \tag{8}$$

We  now  make  the  following  observations  which  are  easy  to  verify. 

• $S_p = A_p \cup B_p \cup C_p$, that is, $A_p$, $B_p$ and $C_p$ constitute a partition of $S_p$. Equality cannot hold in any of the defining inequalities because $p$ is prime. Moreover, $(1, 1, \frac{p-1}{4}) \in B_p$.

• $\alpha$ maps $A_p$ into $C_p$ and $\gamma$ maps $C_p$ into $A_p$; moreover, $\alpha$ and $\gamma$ are inverses of one another. Since $A_p$ and $C_p$ are finite sets, it follows that $|A_p| = |C_p|$.

• $\beta$ maps $B_p$ into itself, and $\beta$ is its own inverse (it is an involution), so it pairs up elements of $B_p$ with one another, except possibly for the fixed points, that is, the triples $(x, y, z)$ which get mapped to themselves; these have no mates and stand alone.

• $\beta$ has just one fixed point. For, if $(x, y, z)$ is a fixed point, then $(2y - x, y, x - y + z) = (x, y, z)$, so $x = y$. This gives $p = x(x + 4z)$, implying that $x = 1$ and $x + 4z = p$ since $p$ is prime. It follows that $(1, 1, \frac{p-1}{4})$ is the sole fixed point of $\beta$.

• $|B_p|$ is odd, for $\beta$ is an involution on $B_p$ with just one fixed point. In turn this implies that $|S_p|$ is odd (because $|A_p| = |C_p|$).

Observe that for each element $(x, y, z) \in S_p$, its 'mate' $(x, z, y)$ also lies in $S_p$. Since $S_p$ has an odd number of elements, it follows that $S_p$ must contain an 'odd man out' which is its own 'mate'. If $(r, s, s)$ is such an element of $S_p$, then $p = r^2 + (2s)^2$, and we are through.
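For small primes the counting argument can be watched in action by brute force. The sketch below (the name `solution_set` is ours) lists every positive solution of $x^2 + 4yz = p$, confirms that the number of solutions is odd, and picks out the self-mated triples $(r, s, s)$.

```python
def solution_set(p):
    """All triples (x, y, z) of positive integers with x*x + 4*y*z == p."""
    triples = []
    x = 1
    while x * x < p:
        rest = p - x * x
        if rest % 4 == 0:
            n = rest // 4
            # Each divisor y of n gives exactly one solution (x, y, n // y).
            triples += [(x, y, n // y) for y in range(1, n + 1) if n % y == 0]
        x += 1
    return triples


for p in (5, 13, 17, 29, 37, 41):
    S = solution_set(p)
    self_mated = [(x, y, z) for (x, y, z) in S if y == z]
    print(p, len(S), self_mated)
```

For $p = 17$, for example, the five solutions include the self-mated triple $(1, 2, 2)$, which gives $17 = 1^2 + 4^2$.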


Towards  a  Constructive  Proof 

Note  that  the  proof  presented  is  not  constructive — it  provides  no  clue  as  to  how  the 
desired  (r,  s)  can  be  computed  for  a  given  p.  (Curiously,  this  is  true  for  most  known 
proofs  of  the  theorem.)  However,  the  argument  used  does  suggest  the  possibility  of 
an  algorithmic  proof.  I  have  empirically  found  that  the  following  algorithm  ‘works’, 
in  the  sense  that  it  always  seems  to  terminate.  However,  I  have  not  been  able  to  devise 
a proof of termination; if found, then a constructive proof of the two squares theorem is at hand.¹ Perhaps some reader would like to take up the challenge and settle the matter.

Consider the set $I_p$ of integer triples $(x, y, z)$ for which $x^2 + 4yz = p$. The set is non-empty, for $(1, 1, \frac{p-1}{4}) \in I_p$. Our objective is to find a triple in $I_p$ of the form $(r, s, s)$; this would immediately provide the desired representation of $p$ as a sum of two squares ($p = r^2 + (2s)^2$). Towards this end we define a function $f : I_p \to I_p$ as follows:

$$f(x, y, z) = \begin{cases} (x + 2z,\ y - z - x,\ z) & \text{if } z + x < y, \\ (2y - x,\ z + x - y,\ y) & \text{if } z + x > y. \end{cases}$$


Example. Let $p = 17$; then $f(1, 1, 4) = (1, 4, 1)$ and $f(1, 4, 1) = (3, 2, 1)$.

We now compute the orbit of the triple $(1, 1, \frac{p-1}{4})$ under action by $f$. If at some stage we reach a triple of the form $(r, s, s)$ we terminate the computation. The curious thing is that we always seem to reach such a triple. Listed below are the initial segments of the orbits for a few $p$'s. In each case we stop when the desired triple is reached.

• $p = 17$

(1, 1, 4), (1, 4, 1), (3, 2, 1), (1, 2, 2); result: $17 = 1^2 + 4^2$.

• $p = 29$

(1, 1, 7), (1, 7, 1), (3, 5, 1), (5, 1, 1); result: $29 = 5^2 + 2^2$.

• $p = 41$

(1, 1, 10), (1, 10, 1), (3, 8, 1), (5, 4, 1), (3, 2, 4), (1, 5, 2), (5, 2, 2); result: $41 = 5^2 + 4^2$.

• $p = 53$

(1, 1, 13), (1, 13, 1), (3, 11, 1), (5, 7, 1), (7, 1, 1); result: $53 = 7^2 + 2^2$.

• $p = 109$

(1, 1, 27), (1, 27, 1), (3, 25, 1), (5, 21, 1), (7, 15, 1), (9, 7, 1), (5, 3, 7), (1, 9, 3), (7, 5, 3), (3, 5, 5); result: $109 = 3^2 + 10^2$.

Any  takers? 
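The orbits listed above are easy to reproduce by machine. Here is a minimal sketch of the iteration, assuming $p$ is a prime of the form $4n + 1$; the names `f`, `orbit_until_representation` and the safety bound `max_steps` are ours. For such $p$ the boundary case $z + x = y$ cannot occur, since it would make $p = (x + 2z)^2$ a perfect square, so the second branch is used whenever the first does not apply.

```python
def f(t):
    """One step of the map f on a triple (x, y, z) with x*x + 4*y*z == p."""
    x, y, z = t
    if z + x < y:
        return (x + 2 * z, y - z - x, z)
    # Here z + x > y (equality is impossible when p is prime).
    return (2 * y - x, z + x - y, y)


def orbit_until_representation(p, max_steps=10**6):
    """Iterate f from (1, 1, (p-1)/4) until a triple (r, s, s) appears."""
    t = (1, 1, (p - 1) // 4)
    path = [t]
    while t[1] != t[2]:
        if len(path) > max_steps:
            raise RuntimeError("no triple (r, s, s) found within max_steps")
        t = f(t)
        path.append(t)
    return path


for p in (17, 29, 41, 53, 109):
    path = orbit_until_representation(p)
    r, s, _ = path[-1]
    print(p, path, f"{p} = {r}^2 + {2 * s}^2")
```

The sketch only observes termination for the primes tried; it does not by itself explain why the iteration must always stop.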


Further  Remarks 

• Weil writes, in his book (see Suggested Reading), that "all known proofs begin ... by showing that $-1$ is a quadratic residue of $p = 4n + 1$". This being so, Zagier's proof is rather atypical.

• The theorem was stated by Fermat in 1640; he never published any proof but in all likelihood did possess one, probably based on the principle of infinite descent (which itself is one of Fermat's inventions). The first published proof, by Euler, appeared in the 1740's; it too uses the principle of infinite descent.


¹ This conjecture was subsequently settled in the affirmative by B Bagchi; see Chapter 7.


Suggested  Reading 

André Weil. Number Theory: An Approach Through History, 1984.


B Bagchi

The  Two  Squares  Theorem 

Throughout this article, $p$ is a prime such that $p \equiv 1 \pmod 4$. $\mathbb{N}$ and $\mathbb{Z}$ will denote, as usual, the set of all natural numbers (excluding zero) and the set of all integers (positive, negative or zero), respectively. Recall that the celebrated two squares theorem (first stated by Fermat and proved by Euler) says that $p$ can be written as a sum of two perfect squares. Clearly one of these two squares must be even (and the other one is odd). Therefore, this theorem may be formulated by saying that there exists $(x, y) \in \mathbb{N} \times \mathbb{N}$ such that $x^2 + 4y^2 = p$. Any such pair $(x, y)$ will be referred to as a representation of $p$. (Actually, as is well known, the representation is unique. For proof, see for instance Niven and Zuckerman in Suggested Reading.)


Permutations 

G  H  Hardy  writes  that  the  two  squares  theorem  ‘is  ranked,  very  justly,  as  one  of  the 
finest  in  arithmetic’.  So  it  comes  as  a  surprise  to  learn  that  its  finest  proof  was  found 
only  in  1990.  In  that  year,  D  Zagier  modified  a  proof  of  the  two  squares  theorem  due 
to Heath-Brown to create a remarkably short and elegant proof. Although Zagier's
proof  was  presented  in  detail  by  Shirali  in  Resonance  (see  Suggested  Reading),  we 
shall  begin  with  a  brief  account  of  this  proof.  To  do  so,  we  need  to  recall  some  facts 
about  permutations. 

If $X$ is a finite set, then by a permutation of $X$ we mean a function from $X$ into itself under which each element of $X$ has a unique pre-image. If $\pi$ and $\sigma$ are any two permutations of $X$, then we can form their 'product' $\pi\sigma$ by composition: $\pi\sigma(x) := \pi(\sigma(x))$, $x$ in $X$. If $X$ is of size $n$, there are only $n!$ permutations of $X$ and they form a group with this product rule. (Though, strictly speaking, we need no group theory for this article, familiarity with the elements of this theory will still be useful.) Since we have defined the product of any two permutations, in particular we can form the powers $\pi = \pi^1, \pi^2, \pi^3, \ldots$, of any given permutation $\pi$. Since there are only finitely many distinct permutations of $X$, some two of the powers of $\pi$ must actually be equal. By cancellation, it follows that there must exist a natural number $m$ such that $\pi^m$ is the identity permutation id fixing all elements of $X$. The smallest such number is called the order of $\pi$. A permutation of order two is called an involution.

Any permutation $\pi$ of $X$ breaks up ('partitions') $X$ into one or more parts such that two elements $x$ and $y$ of $X$ are in the same part if and only if some power of $\pi$ takes $x$ to $y$. These parts are called the orbits of $\pi$. The singleton orbits are just the fixed points of $\pi$. A permutation of $X$ is said to be transitive on $X$ if it has only one orbit (namely, the whole of $X$).

It is easy to convince oneself that the size of any orbit of a permutation divides the order of the permutation. In particular, if the permutation $\pi$ has prime order $q$, then (as 1 and $q$ are the only divisors of $q$) each orbit is either a fixed point or has size $q$. It follows that, in this case, the number of fixed points of $\pi$ is congruent modulo $q$ to the size $n$ of $X$. Hence $\pi$ has a fixed point if $n$ is not a multiple of $q$. As a special case ($q = 2$) of this observation, we see that an involution of $X$ has a fixed point in $X$ if $X$ is an odd set (i.e., the number of elements of $X$ is odd). This is the key fact which makes Zagier's proof (and its constructive versions presented here) work.


Zagier’s  Proof 

Now we come to Zagier's proof. Let $S$ denote the subset of $\mathbb{N} \times \mathbb{N} \times \mathbb{N}$ defined by

$$S = \{(x, y, z) \in \mathbb{N} \times \mathbb{N} \times \mathbb{N} : x^2 + 4yz = p\}.$$

Clearly $S$ is a finite set. Zagier defines two involutions $\alpha$ and $\beta$ of $S$ by

$$\alpha(x, y, z) = \begin{cases} (x + 2z,\ z,\ y - x - z) & \text{if } x < y - z, \\ (2y - x,\ y,\ x + z - y) & \text{if } y - z < x < 2y, \\ (x - 2y,\ x + z - y,\ y) & \text{if } x > 2y, \end{cases}$$

$$\beta(x, y, z) = (x, z, y).$$

The involution $\alpha$ of the finite set $S$ has a unique fixed point (namely $(1, 1, \frac{p-1}{4})$). It follows that $S$ is an odd set. Therefore, the involution $\beta$ of the odd set $S$ must have an odd number (hence at least one) of fixed points in $S$. But $(x, y) \mapsto (x, y, y)$ is a bijection of the set of representations of $p$ onto the set of fixed points of $\beta$. Hence $p$ has at least one representation (as a sum of two squares). This completes Zagier's proof of the two squares theorem.


Shirali’s  Conjecture 

Zagier notes in his paper that his proof 'is not constructive: it does not give a method to actually find the representation of $p$ as a sum of two squares'. Perhaps provoked by this statement, S A Shirali gave a conjectural way to 'constructivize' this proof. Shirali's conjecture may be phrased as follows. Define a finite subset $\hat{S}$ of $\mathbb{Z} \times \mathbb{N} \times \mathbb{N}$ by

$$\hat{S} = \{(x, y, z) \in \mathbb{Z} \times \mathbb{N} \times \mathbb{N} : x + y > z \text{ and } x^2 + 4yz = p\}.$$

Define a function $\hat{\gamma} : \hat{S} \to \hat{S}$ by

$$\hat{\gamma}(x, y, z) = \begin{cases} (x + 2z,\ y - x - z,\ z) & \text{if } x + z < y, \\ (2y - x,\ x + z - y,\ y) & \text{if } x + z > y. \end{cases}$$

Then, Shirali conjectures that the orbit of the point $(1, \frac{p-1}{4}, 1)$ under $\hat{\gamma}$ contains a point of the form $(x, y, y)$. That is, to obtain a point $(x, y, y) \in \hat{S}$ (and hence a square plus square representation of $p$), begin with the point $(1, \frac{p-1}{4}, 1)$ and look at the successive iterates (powers) of $\hat{\gamma}$ on this point until a point $(x, y, y)$ is obtained.

(Actually, Shirali defines his function on the (infinite) set of all points $(x, y, z)$ in $\mathbb{Z} \times \mathbb{Z} \times \mathbb{Z}$ satisfying $x^2 + 4yz = p$, and proposes to begin with the $\alpha$-fixed point $(1, 1, \frac{p-1}{4})$. However, we observed that this function fixes the finite subset $\hat{S}$ introduced above and on this subset restricts to $\hat{\gamma}$ as defined. Though the $\alpha$-fixed point itself does not belong to this subset, its image under Shirali's original function is $(1, \frac{p-1}{4}, 1)$, which does belong. Therefore, our formulation of the conjecture is entirely equivalent to Shirali's original formulation.)


A  Constructive  Version  of  Zagier’s  Proof 

Notice that the function $\hat{\gamma}$ is a 'perturbation' of the permutation $\gamma := \alpha\beta$ of $S$ obtained by composing Zagier's involutions $\alpha$ and $\beta$. So it is natural to ask if Shirali's conjecture is valid with $\hat{\gamma}$ replaced by $\gamma$. In the following theorem, we show that this modified conjecture is indeed correct. Note that we now stay within the set $S$, and this is closer to Zagier's original proof.

THEOREM. Let $k$ denote the size of the orbit $T$ under $\gamma := \alpha\beta$ which contains the $\alpha$-fixed point $a$. Then $k$ is odd; $T$ contains a unique $\beta$-fixed point $b$ and it is given by the formula $b = \gamma^{(k-1)/2}(a)$. In fact, the orbit $T$ satisfies the symmetry relation $\gamma^{k-1-n}(a) = \beta(\gamma^n(a))$ for $0 \le n \le k - 1$.

Thus, to obtain a $\beta$-fixed point $(x, y, y)$ (and hence the representation $p = x^2 + (2y)^2$), begin with the $\alpha$-fixed point and iterate $\alpha\beta$ on it; in a finite number of steps you will reach a $\beta$-fixed point. This theorem shows that exactly half the orbit has to be traversed before this point is reached; the remaining half may be found (in reverse order) simply by applying $\beta$ to the first half.
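The procedure described in the theorem translates directly into code. The following is a sketch under the assumption that $p$ is a prime with $p \equiv 1 \pmod 4$; the function names are ours. It builds the whole orbit of the $\alpha$-fixed point under $\gamma = \alpha\beta$, so that both the midpoint formula and the symmetry relation can be inspected.

```python
def alpha(t):
    """Zagier's piecewise involution on S. For prime p the boundary
    cases x == y - z and x == 2*y never occur on S."""
    x, y, z = t
    if x < y - z:
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:
        return (2 * y - x, y, x + z - y)
    return (x - 2 * y, x + z - y, y)


def beta(t):
    """The involution that swaps the last two coordinates."""
    x, y, z = t
    return (x, z, y)


def gamma(t):
    """gamma = alpha composed with beta (beta is applied first)."""
    return alpha(beta(t))


def orbit_of_a(p):
    """The full gamma-orbit of the alpha-fixed point a = (1, 1, (p-1)/4)."""
    a = (1, 1, (p - 1) // 4)
    orbit = [a]
    t = gamma(a)
    while t != a:
        orbit.append(t)
        t = gamma(t)
    return orbit


def two_squares(p):
    """Read off the representation from the middle of the orbit."""
    orbit = orbit_of_a(p)
    k = len(orbit)
    x, y, z = orbit[(k - 1) // 2]
    assert y == z and x * x + (2 * y) ** 2 == p
    return x, 2 * y


for p in (5, 13, 17, 29, 41):
    print(p, two_squares(p), orbit_of_a(p))
```

In practice only the first half of the orbit needs to be computed; the full orbit is generated here purely to make the symmetry $\gamma^{k-1-n}(a) = \beta(\gamma^n(a))$ visible in the printed lists.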

PROOF. Since $\alpha$ and $\beta$ are involutions, $\alpha$ 'normalises' $\gamma$: $\alpha\gamma\alpha^{-1} = \beta\alpha = \gamma^{-1}$. Therefore, $\alpha$ maps the orbits of $\gamma$ to orbits of $\gamma$. (To see this, let $s_1$ and $s_2$ be two points from a common $\gamma$-orbit. By definition, this means that there is an integer $\theta$ such that $\gamma^{\theta}(s_1) = s_2$. Then $\alpha(s_2) = \alpha\gamma^{\theta}(s_1) = \alpha\gamma^{\theta}\alpha^{-1}(\alpha(s_1)) = \gamma^{-\theta}(\alpha(s_1))$. Thus, whenever $s_1$ and $s_2$ in $S$ are from a common $\gamma$-orbit, $\alpha(s_1)$ and $\alpha(s_2)$ are also in a common $\gamma$-orbit. So the image under $\alpha$ of any $\gamma$-orbit is again a $\gamma$-orbit.) In particular, if $T$ is the orbit under $\gamma$ which contains the fixed point $a$ of $\alpha$, then $\alpha(T)$ is an orbit which meets the orbit $T$ in this fixed point, hence we must have $\alpha(T) = T$. Since the restriction of $\alpha$ to $T$ is an involution of $T$ with a unique fixed point, it follows as before that $T$ is an odd set. Since both $\alpha$ and $\gamma$ fix $T$, so does $\beta = \alpha\gamma$. Thus the restriction to $T$ of $\beta$ is an involution of the odd set $T$, and hence $\beta$ must have a fixed point $b$ in $T$. So there is a $\theta$, $0 \le \theta \le k - 1$, such that $b = \gamma^{\theta}(a)$ is fixed by $\beta$. To prove the uniqueness of this fixed point, it suffices to show that $k = 2\theta + 1$ is forced on us.

For $m \in \mathbb{Z}$, we have $\beta(\gamma^m(b)) = \beta\gamma^m\beta^{-1}(\beta(b)) = \gamma^{-m}(b)$. Substituting $\gamma^{\theta}(a)$ for $b$, we find that the orbit $T$ has a two-fold symmetry around its $\theta$th term:

$$\gamma^{\theta + m}(a) = \beta(\gamma^{\theta - m}(a)) \quad \forall m \in \mathbb{Z}.$$

In particular, taking $m = \theta + 1$ in this identity, we get $\gamma^{2\theta + 1}(a) = \beta\gamma^{-1}(a) = \beta\beta\alpha(a) = \alpha(a) = a$. From the definition of $k$, one sees that an integer $h$ satisfies $\gamma^h(a) = a$ iff $h$ is an integral multiple of $k$. Since $h = 2\theta + 1$ satisfies this condition, $2\theta + 1$ is a multiple of $k$. Since $1 \le 2\theta + 1 < 2k$, this forces $2\theta + 1 = k$. Finally, substituting $m = \theta - n = \frac{k-1}{2} - n$ in the displayed identity, we get the last assertion of the theorem.


Shirali’s  Conjecture  Vindicated 

Define the involutions $\hat{\alpha}$ and $\hat{\beta}$ of the finite set $\hat{S}$ as follows:

$$\hat{\alpha}(x, y, z) = (2z - x,\ x + y - z,\ z),$$

$$\hat{\beta}(x, y, z) = \begin{cases} (-x,\ y,\ z) & \text{if } x + z < y, \\ (x,\ z,\ y) & \text{if } x + z > y. \end{cases}$$

One readily verifies that (i) these are indeed involutions of $\hat{S}$, (ii) $\hat{\alpha}$ has a unique fixed point, namely $\hat{a} := (1, \frac{p-1}{4}, 1)$, and (iii) $(x, y) \mapsto (x, y, y)$ is a bijection from the representations of $p$ onto the fixed points of $\hat{\beta}$. Thus, in Zagier's proof, one may replace $\alpha$, $\beta$ and $S$ by $\hat{\alpha}$, $\hat{\beta}$ and $\hat{S}$, respectively. Finally, Shirali's function $\hat{\gamma}$ is related to these involutions by $\hat{\gamma} = \hat{\alpha}\hat{\beta}$. Therefore, the indicated substitutions in the proof of the above theorem yield a 'hatted' version of the theorem. In particular, this proves Shirali's conjecture.


Uniqueness  of  the  Square  Plus  Square 
Representation  of  p 

Aside from being non-constructive, Zagier's proof has another shortcoming. As already mentioned, the prime $p$ has a unique representation as a sum of two squares. Or, what amounts to the same thing, $\beta$ also has a unique fixed point in $S$. But this does not emerge from Zagier's proof (or from its constructive variations given above). We are unable to remedy this defect. Notice, however, that in view of the uniqueness assertion in the above theorem, it would suffice to show that $\gamma$ acts transitively on $S$. (For, this would mean that $T = S$, and we know that $\beta$ has a unique fixed point in $T$.) Computations by hand show that this is indeed correct for primes below a hundred. One might therefore be tempted to conjecture that, generally, $\gamma$ acts transitively on $S$. If correct, this would provide a neat explanation for the uniqueness of the $\beta$-fixed point. Unfortunately, this conjecture is incorrect. Its validity for small primes turns out to be yet another instance of the 'strong law of small numbers'. (If you have never heard of this law then you are urged to take a look at the beautiful article by Guy [1].)

We  see  this  as  follows. 

For  each  fixed  x,  the  number  of  points  in  S  with  the  given  first  coordinate  equals 

_  2 

d Therefore  we  have  the  formula 

X  '  ' 

where  the  sum  is  over  all  odd  numbers  x  in  the  range  1  <  x  <  yfp.  (Here  </(•)  is  the 
usual  divisor  function:  for  n  e  lN,d(n)  is  the  number  of  divisors  of  n  including  1 
and  n.) 

Let $p$ be of the form $k^2 + 4$ (for an odd number $k$). Then, in the iterates under $\gamma$
of the point $a = (1, 1, \frac{p-1}{4})$, the first coordinate increases in steps of two until the
point $b = (k, 1, 1)$ is reached, then it decreases in steps of two until we reach the end
point $(1, \frac{p-1}{4}, 1)$ of the orbit. This shows that in this case, the size $k$ of the orbit $T$
is related to the prime $p$ by $p = k^2 + 4$. Also, the sum in the formula for $\#(S)$ given
above has $(k + 1)/2$ terms of which one term equals 1 while the remaining $(k - 1)/2$
terms are $\ge 2$. Since $d(n) = 2$ iff $n$ is a prime, it follows that for a prime of the form
$p = k^2 + 4$, $\gamma$ is transitive on $S$ (i.e., $k = \#(S)$) iff $(p - x^2)/4$ is a prime for all odd
numbers $x$ in the range $1 \le x < k$. This shows, for instance, that we do not have
transitivity for $p = 229$.
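
The counting argument can be checked numerically. The following Python sketch is ours, not the author's. It assumes that $\alpha$ is Zagier's involution in its usual three-branch form, that $\beta$ swaps the last two coordinates, and that $\gamma$ is "apply $\beta$, then $\alpha$" (the inverse composition has the same orbits). All function names are hypothetical.

```python
def divisor_count(n):
    """Number of positive divisors of n, by trial division."""
    count, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            count += 1 if d * d == n else 2
        d += 1
    return count


def size_of_S(p):
    """#(S) from the divisor-sum formula: sum over odd x with x*x < p."""
    total, x = 0, 1
    while x * x < p:
        total += divisor_count((p - x * x) // 4)
        x += 2
    return total


def alpha(v):
    """Zagier's involution on S = {(x, y, z) : x^2 + 4yz = p} (assumed form)."""
    x, y, z = v
    if x < y - z:
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:
        return (2 * y - x, y, x - y + z)
    return (x - 2 * y, x - y + z, y)


def beta(v):
    """The involution swapping the last two coordinates."""
    x, y, z = v
    return (x, z, y)


def orbit_size(p):
    """Size of the gamma-orbit of the alpha-fixed point (1, 1, (p-1)/4)."""
    start = (1, 1, (p - 1) // 4)
    v, size = alpha(beta(start)), 1
    while v != start:
        v = alpha(beta(v))
        size += 1
    return size


for p in (53, 229):  # both are primes of the form k^2 + 4 (k = 7 and k = 15)
    print(p, orbit_size(p), size_of_S(p))
```

For $p = 53$ the orbit exhausts $S$, while for $p = 229$ it does not, because $(229 - 1)/4 = 57 = 3 \cdot 19$ is composite.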


Inefficiency  of  the  Algorithm 

Clearly, the $\alpha$-$\beta$ algorithm needs at most $\frac{1}{2}\#(S)$ steps. Since $d(n) = O(n^{\varepsilon})$ and
the formula for $\#(S)$ has $O(p^{1/2})$ terms in it, the number of necessary iterations is
$O(p^{1/2+\varepsilon})$. The example of primes of the form square plus four (presumably there are
infinitely many such primes) shows that this estimate is close to the best possible.
Wagon describes known algorithms whose complexity is polynomial in $\log p$, and
the $\alpha$-$\beta$ algorithm compares very unfavourably (see Suggested Reading). But it may
be that we have looked at the worst case, and for some large class of primes its
performance is much better. Moreover, it may be possible to significantly improve on
the performance of the algorithm as follows. The set $S$ can be partitioned into three
parts on each of which $\gamma$ is linear (the permutation $\hat{\gamma}$ is even better in this respect; we
have a partition of $S$ into two parts on each of which $\hat{\gamma}$ is linear). The runs of iteration
during which the iterates stay in the same piece of $S$ may easily be combined into a
single step.


A Combinatorial Lemma

The  perceptive  reader  may  have  suspected  by  now  that  the  theorem  presented  above 
does  not  have  much  to  do  with  primes  or  their  representations  by  squares.  This  is 
indeed  correct,  and  the  theorem  is  a  manifestation  of  a  combinatorial  phenomenon. 
We  have: 

LEMMA. For any two involutions $\alpha$ and $\beta$ of a finite set $S$, there are only three
possibilities for any $\alpha\beta$-orbit: (i) neither involution has a fixed point in the orbit, or
(ii) each of them has a unique fixed point in the orbit, or (iii) one of them has two
fixed points in the orbit while the other has none.

At  first  glance,  this  statement  may  look  very  strange.  (For  readers  with  a  reasonable 
amount  of  familiarity  with  groups  and  group  actions,  here  is  a  hint  for  a  group- 
theoretic  proof  of  this  lemma:  think  of  the  group  of  isometries  of  a  regular  polygon.) 
But  here  is  an  elementary  (‘graph-theoretic’)  proof. 

Let $\gamma = \alpha\beta$. Fix a $\gamma$-orbit $T$. If neither $\alpha$ nor $\beta$ has a fixed point in $T$, then there is
nothing to prove: we are in case (i) of the lemma. So assume that one of these two
involutions has at least one fixed point. Then, arguing as in the proof of the above
theorem, one sees that $T$ is fixed by both $\alpha$ and $\beta$. Thus $T$ is a union of $\alpha$-orbits
as well as of $\beta$-orbits. If $T$ is a singleton, then we are in case (ii) and again there
is nothing to prove. So we may assume that $T$ has at least two elements. Hence no
element of $T$ is fixed by $\gamma$.

Now consider the graph $G$ defined as follows. The vertices of $G$ are the elements
of $T$. Two distinct elements $x$, $y$ of $T$ are joined by an edge in $G$ if (and only if)
$y = \alpha(x)$ or $y = \beta(x)$ (i.e., if $\{x, y\}$ is an orbit of one of the involutions). Clearly,
this is an undirected graph. Note that, for each $x$ in $T$, $\alpha(x)$ and $\beta(x)$ are distinct
elements of $T$, or else $x$ would be fixed by $\gamma$, contrary to our assumption. It follows
that each vertex $x$ is of degree 1 or 2 in $G$ (i.e., $x$ is joined to one or two vertices),
according to whether $x$ is or is not fixed by one (and only one) of the two involutions.
Since we have assumed that at least one of them has a fixed point in $T$, it follows
that $G$ has at least one vertex of degree one. Also, since $\gamma = \alpha\beta$ is transitive on $T$
($T$ is a $\gamma$-orbit!), it follows that $G$ is connected. Now, here is the punch line: the only
connected graphs with all vertices of degree $\le 2$ and at least one vertex of degree 1
are the paths. Hence $G$ is a path. So $G$ has exactly two vertices of degree 1 (the two
ends of the path) and hence we are in case (ii) or (iii). This proves the lemma.
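
The lemma is easy to test empirically. The Python sketch below is an illustration of ours, not part of the original argument. It builds random pairs of involutions on a small set, splits the set into orbits of $\gamma$ (taken here as "apply $\beta$, then $\alpha$"), and counts the fixed points of each involution in every orbit. The helper names and the seeded random generator are assumptions made for the example.

```python
import random


def random_involution(n, rng):
    """A random involution of {0, ..., n-1}: some disjoint swaps, the rest fixed."""
    items = list(range(n))
    rng.shuffle(items)
    inv = list(range(n))
    for i in range(rng.randint(0, n // 2)):
        a, b = items[2 * i], items[2 * i + 1]
        inv[a], inv[b] = b, a
    return inv


def orbit_fixed_point_counts(alpha, beta):
    """For each orbit of gamma, return (#alpha-fixed points, #beta-fixed points)."""
    n = len(alpha)
    seen = [False] * n
    counts = []
    for start in range(n):
        if seen[start]:
            continue
        orbit, v = [], start
        while not seen[v]:          # gamma is a permutation, so we return to start
            seen[v] = True
            orbit.append(v)
            v = alpha[beta[v]]
        fixed_a = sum(1 for u in orbit if alpha[u] == u)
        fixed_b = sum(1 for u in orbit if beta[u] == u)
        counts.append((fixed_a, fixed_b))
    return counts


# Cases (i), (ii) and (iii) of the lemma, as pairs of fixed-point counts.
ALLOWED = {(0, 0), (1, 1), (2, 0), (0, 2)}

rng = random.Random(0)
ok = all(
    pair in ALLOWED
    for _ in range(200)
    for pair in orbit_fixed_point_counts(random_involution(12, rng),
                                         random_involution(12, rng))
)
print(ok)
```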

EXERCISE: Continue this argument to see that if the elements of $T$ are arranged on
a circle according to the action of $\gamma$, then the two ends of $G$ are placed opposite to
each other. This explains the symmetry observed in the theorem.


A  Prime  Testing  Algorithm? 

If $n \equiv 1 \pmod 4$ is a number (not necessarily a prime) which is not a perfect square,
then $S$, $\alpha$, $\beta$ may be defined as before with $n$ replacing $p$. What happens if one runs the
$\alpha$-$\beta$ algorithm in this case? Our combinatorial lemma shows that if we look inside the
orbit $T$ containing the fixed point $(1, 1, \frac{n-1}{4})$ of $\alpha$, either we may find a fixed point
of $\beta$ and hence a representation of $n$ as a sum of two squares, or we find a second
fixed point $(x, x, z)$ of $\alpha$ and hence a nontrivial factorisation $n = x(x + 4z)$ of $n$.
The second case is bound to occur if the square free part of $n$ has a 3 (mod 4) factor
(since in this case $n$ has no representation as a sum of two squares). In the former
case, of course, we are unable to decide whether $n$ is a prime or not (for instance,
this case occurs if $n$ is a number of the form $k^2 + 4$, even when $n$ is composite). If,
however, we happen to know a two squares representation of $n$ and the algorithm is
lucky enough to produce a second representation, then we can still conclude that $n$ is
composite (because a prime has at most one such representation). Perhaps it will be
interesting to characterise those numbers $n$ for which the first case occurs.
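
A minimal Python sketch of this experiment follows. It assumes, as before, that $\alpha$ is Zagier's involution in its usual three-branch form and that $\beta$ swaps the last two coordinates. The function `alpha_beta_walk` and its return convention are hypothetical names chosen for the illustration. Starting from the $\alpha$-fixed point it applies $\beta$ and $\alpha$ alternately, which walks along the path of the combinatorial lemma until the other end is reached.

```python
from math import isqrt


def alpha(v):
    """Zagier's involution on {(x, y, z) : x^2 + 4yz = n} (assumed form)."""
    x, y, z = v
    if x < y - z:
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:
        return (2 * y - x, y, x - y + z)
    return (x - 2 * y, x - y + z, y)


def alpha_beta_walk(n):
    """Return ("squares", u, w) with n = u^2 + w^2, or ("factors", r, s) with n = r*s."""
    if n % 4 != 1 or isqrt(n) ** 2 == n:
        raise ValueError("n must be 1 mod 4 and not a perfect square")
    v = (1, 1, (n - 1) // 4)            # the obvious fixed point of alpha
    while True:
        x, y, z = v
        if y == z:                      # v is fixed by beta: n = x^2 + (2y)^2
            return ("squares", x, 2 * y)
        v = (x, z, y)                   # apply beta
        w = alpha(v)
        if w == v:                      # a second fixed point (x, x, z) of alpha
            x, y, z = v
            return ("factors", x, x + 4 * z)
        v = w


for n in (13, 21, 33, 65):
    print(n, alpha_beta_walk(n))
```

Note that for $n = 65 = 5 \cdot 13$ the walk ends at a fixed point of $\beta$, so the output alone does not reveal that 65 is composite.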


Factoring Fermat Numbers

A Unique Computational Experiment for Factoring $F_9$

C  E  Veni  Madhavan 

Fermat observed that the numbers $F_k = 2^{2^k} + 1$, $k = 0, 1, 2, 3, 4$ are prime, and
wondered whether this was true for all $k$. Euler found that the very next Fermat
number is composite: $F_5 = 2^{32} + 1 = 641 \times 6700417$. So far it has been verified that
$F_k$, $5 \le k \le 22$ are all composite. No one knows whether any other $F_k$ is prime. The
numbers $F_k$ grow rapidly with $k$ (each is almost the square of the previous number)
and it is a very difficult task to decide their primality. We give below an outline of
the relevant computational challenges.

First note that, if $k$ is odd, 3 divides $2^k + 1$ and in general, $2^a + 1$ divides $2^{ak} + 1$.
Thus, if $k$ is not a power of two, $2^k + 1$ is not prime. Fermat hazarded a guess
that the converse was also true. In 1877, Francois Pepin published a necessary and
sufficient condition which states that $F_k$, $k > 1$, is prime if and only if $F_k$ divides
$5^{(F_k - 1)/2} + 1$. This condition is the basis for determining whether $F_k$ is prime for
any given $k$. Failure of this condition means that $F_k$ is composite. It does not reveal
any information about the factors.
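
For small $k$ the criterion can be tried directly. The sketch below is ours; it uses Python's built-in three-argument `pow` for modular exponentiation, and the function names `fermat` and `pepin` are hypothetical. It only illustrates the test and says nothing about how the record computations were actually organised.

```python
def fermat(k):
    """The k-th Fermat number 2^(2^k) + 1."""
    return 2 ** (2 ** k) + 1


def pepin(k):
    """Pepin's criterion with base 5 (valid for k > 1): is F_k prime?"""
    if k < 2:
        raise ValueError("the base-5 criterion is stated for k > 1")
    F = fermat(k)
    # F divides 5^((F-1)/2) + 1  iff  5^((F-1)/2) is congruent to -1 modulo F
    return pow(5, (F - 1) // 2, F) == F - 1


print([pepin(k) for k in range(2, 9)])
print(fermat(5) == 641 * 6700417)        # Euler's factorisation of F_5
print((2 ** 15 + 1) % (2 ** 5 + 1))      # 2^a + 1 divides 2^(ak) + 1 for odd k
```

As the text says, a negative answer proves compositeness without exhibiting a factor.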

Today,  sophisticated  number  theoretic  methods  and  powerful  computing  platforms 
are  used  for  testing  primality  and  factoring  of  large  integers.  These  find  applications 
in  many  practical  problems  such  as  cryptography.  The  recent  records  in  Fermat  num¬ 
ber  factoring  have  been  achieved  by  means  of  two  techniques  called  number  field 
sieve  (NFS)  and  elliptic  curve  method  (ECM). 

The complete factoring of $F_9$, which has about 150 decimal digits, was carried out
in  1992  by  a  unique  computational  experiment.  Hundreds  of  computers  in  different 
parts  of  the  world,  working  independently  and  in  their  spare  time  generated  certain 
seed  numbers.  These  computers  sent  their  seeds  by  electronic  mail  to  a  host  computer 
in  USA.  The  host  carried  out  the  combination  of  the  seeds  and  the  factoring.  The 
NFS  method,  requiring  the  generation  of  an  enormous  number  of  such  seeds,  was 
thus  eminently  suitable  for  this  exercise.  However,  this  method  is  quite  difficult  to 
implement.

Last year the number $F_{22}$ was determined to be composite, using Pepin’s criterion
and extremely fast arithmetical algorithms implemented on supercomputers. This
number of about 1.3 million decimal digits (about 500 times as long as this chapter)
required about $10^{16}$ arithmetical operations and about seven months of real time.
Complete factorization of Fermat numbers is known only for $k \le 9$ and $k = 11$. No
prime factors of $F_{14}$ and $F_{20}$ are known.


C  E  Veni  Madhavan 
Department  of  Computer 
Science  and  Automation 
Indian  Institute  of  Science 
Bangalore 560 012


The Class Number Problem

Binary Quadratic Forms

Rajat Tandon

Introducing  the  reader  to  the  notion  of  ‘class  numbers’,  this  chapter  defines 

class  numbers  the  way  they  arose  in  the  study  of  ‘binary  quadratic  equations’. 

Remember the formula $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ that we all learnt in school. Indians have a long
history of work on quadratics. The high point seems to have been when Brahmagupta
in the early seventh century gave a method by which, knowing one solution $(x, y)$
in integers of the equation $cX^2 + 1 = Y^2$ (Pell’s Equation!), where $c$ is a constant
integer, he could generate an infinite family of solutions. But I am interested here in
the above formula. I will always assume that $a$, $b$, $c$ are integers. The quantity $b^2 - 4ac$
under the square root sign gives us information about the quadratic $aX^2 + bX + c$.
For instance, it tells us whether the quadratic has any real roots: it must be non-negative
for this to be so. It tells us whether the quadratic has any rational roots: it must
be a perfect square for this to be so. We call it the discriminant of the quadratic
$aX^2 + bX + c$. The class number problem is concerned with the following questions:

1) Given an integer $\Delta$, are there any quadratics $F(X) = aX^2 + bX + c$ ($a$, $b$, $c$
integers) whose discriminant $b^2 - 4ac$ equals $\Delta$?

2) If so, how many such quadratics exist? Can we classify them in any way?

It is obvious that if $\Delta = b^2 - 4ac$, then 4 divides $\Delta$ or 4 divides $\Delta - 1$, i.e., $\Delta \equiv 0$
or $1 \pmod 4$. This is a necessary condition for there to be an integral quadratic with
discriminant $\Delta$. It is a simple exercise to show that it is also sufficient. So we have
a complete answer to the first question. The second question is considerably more
complex.
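
One way to do the exercise (a sketch of ours, not necessarily the solution the author has in mind) is to write down an explicit quadratic with $a = 1$ in each of the two cases:

```latex
\[
\Delta \equiv 0 \pmod 4:\qquad F(X) = X^2 - \frac{\Delta}{4},
\qquad b^2 - 4ac = 0 - 4\cdot 1\cdot\left(-\frac{\Delta}{4}\right) = \Delta ;
\]
\[
\Delta \equiv 1 \pmod 4:\qquad F(X) = X^2 + X - \frac{\Delta - 1}{4},
\qquad b^2 - 4ac = 1 + (\Delta - 1) = \Delta .
\]
```

In both cases the constant term is an integer precisely because of the congruence condition on $\Delta$.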

Before proceeding further let me give a quick recap of the notion of an equivalence
relation. A relation $\sim$ on a set $S$ is called an equivalence relation if it is reflexive
($x \sim x$), symmetric ($x \sim y \Rightarrow y \sim x$) and transitive ($x \sim y$ and $y \sim z \Rightarrow x \sim z$).
Let $[x]$ denote the subset of $S$ consisting of elements equivalent to $x$. It is called
an equivalence class; note that any two equivalence classes are either identical or
disjoint. Then $S$ is the disjoint union of distinct equivalence classes. We denote the
set of equivalence classes by $S/\!\sim$.

Suppose we replace $X$ by $X + 1$ in the quadratic $aX^2 + bX + c$. We have $a(X + 1)^2
+ b(X + 1) + c = aX^2 + (b + 2a)X + (a + b + c)$. The discriminant of this is $(b + 2a)^2
- 4a(a + b + c) = b^2 - 4ac$. So the discriminant does not change if we replace
$F(X) = aX^2 + bX + c$ by $F(X + 1)$ and hence by $F(X + 2), F(X + 3), \ldots$. Similarly
for $F(X - 1)$, $F(X - 2)$, etc. Notice also that $\Delta = b^2 - 4ac$ is symmetric in $a$ and
$c$, i.e., if we replace $aX^2 + bX + c$ by $cX^2 + bX + a$ then the discriminant does not
change. This indicates that it might be better to replace $F(X) = aX^2 + bX + c$ by the
corresponding homogeneous polynomial in 2 variables $F(X, Y) = aX^2 + bXY +
cY^2$. Then instead of the transformation $X \to X + 1$ we take the transformation
$X \to X + Y$, $Y \to Y$.

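Let $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $W = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. $T$ and $W$ are members of $SL(2, \mathbb{Z})$, the
group of $2 \times 2$ matrices with integer coefficients and determinant 1. If
$A = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in SL(2, \mathbb{Z})$ and $F$ is a homogeneous quadratic polynomial in two
variables, we denote by $A \cdot F$ the polynomial obtained by replacing $X$ by $\alpha X + \beta Y$
and $Y$ by $\gamma X + \delta Y$. Observe that if $A, B \in SL(2, \mathbb{Z})$ then $A \cdot (B \cdot F) = (BA) \cdot F$
(with this substitution convention the order of the product is reversed). It is
easy to check that if $F$ has discriminant $\Delta$, then so does $A \cdot F$ for any $A \in SL(2, \mathbb{Z})$.
Denote by $S(\Delta)$ the set of all integral homogeneous quadratic polynomials in two
variables of discriminant $\Delta$.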
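
The substitution rule is easy to mechanise. In the following Python sketch (ours; the names `act`, `disc` and `matmul` are hypothetical) a form $aX^2 + bXY + cY^2$ is stored as the triple `(a, b, c)` and a matrix as a pair of rows. We assume $T$ is the matrix of $X \to X + Y$, $Y \to Y$ and take $W$ to be the matrix of $X \to Y$, $Y \to -X$; the opposite sign choice for $W$ acts on forms in the same way.

```python
def act(A, F):
    """Substitute X -> pX + qY, Y -> rX + sY in F = aX^2 + bXY + cY^2."""
    (p, q), (r, s) = A
    a, b, c = F
    return (a * p * p + b * p * r + c * r * r,
            2 * a * p * q + b * (p * s + q * r) + 2 * c * r * s,
            a * q * q + b * q * s + c * s * s)


def disc(F):
    """Discriminant b^2 - 4ac of the form F."""
    a, b, c = F
    return b * b - 4 * a * c


def matmul(A, B):
    """Product of two 2x2 matrices given as pairs of rows."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


T = ((1, 1), (0, 1))
W = ((0, 1), (-1, 0))
F = (2, 3, 5)

print(act(T, F))                      # a, b + 2a, a + b + c
print(act(W, F))                      # a and c are swapped, b changes sign
print(disc(F), disc(act(T, F)), disc(act(W, F)))

# Two substitutions compose from the inside out under this convention.
A, B = ((2, 1), (1, 1)), ((1, 3), (0, 1))
print(act(A, act(B, F)) == act(matmul(B, A), F))
```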

We define an equivalence relation on $S(\Delta)$ by $F \sim G$ if either $F = G$ or there
exists a chain $F_1, F_2, \ldots, F_n$ in $S(\Delta)$ such that $F = F_1$, $G = F_n$ and each $F_{i+1}$ is
either $T \cdot F_i$ or $T^{-1} \cdot F_i$ or $W \cdot F_i$; such a chain is called a chain from $F$ to $G$. It
is easy to see that this gives an equivalence relation on $S(\Delta)$. Hence $S(\Delta)$ can be
partitioned into equivalence classes. We remark that it can be shown that $SL(2, \mathbb{Z})$
is generated by $T$ and $W$ and hence two forms $F$ and $G$ are equivalent if and only if
there exists an $A \in SL(2, \mathbb{Z})$ such that $A \cdot F = G$.

Assume from now on that $\Delta < 0$. This is not because the case $\Delta > 0$ is uninteresting
but because it is more difficult and less is known in this case. If the discriminant
of $F(X, Y) = aX^2 + bXY + cY^2$ is $\Delta$ then it is also so for $-F$. Note that $\Delta < 0$
implies that $a$ and $c$ have the same sign. We define $S_1(\Delta)$ to be the subset of $S(\Delta)$
consisting of those forms $F$ for which $a$ and $c$ are positive, and $S_2(\Delta)$ its complement.
Then $F \to -F$ is a bijection from $S_1(\Delta)$ to $S_2(\Delta)$. It is also easy to see that no
member of $S_1(\Delta)$ can be equivalent to any member of $S_2(\Delta)$. We restrict ourselves
to $S_1(\Delta)$.

DEFINITION. The form $F(X, Y) = aX^2 + bXY + cY^2$ of $S_1(\Delta)$ is said to be almost
reduced, if $|b| \le a \le c$.

THEOREM. Each equivalence class in $S_1(\Delta)$ has at least one almost reduced form.

PROOF. Consider an equivalence class with an element $F(X, Y) = aX^2 + bXY
+ cY^2$ in it. If $a > c$ replace $F$ by $W \cdot F = F_1$ (say). Then $F_1(X, Y) = a_1X^2
+ b_1XY + c_1Y^2$ with $a_1 = c$ and $c_1 = a$, and so $a_1 < c_1$. Notice $a > a_1$. If now
$|b_1| \le a_1$, $F_1$ is almost reduced. If not, find an integer $n$ such that $|b_1 + 2a_1n| \le a_1$. Replace
$F_1$ by $F_2 = T^n \cdot F_1$. Then

$$\begin{aligned}
F_2(X, Y) &= a_1(X + nY)^2 + b_1(X + nY)Y + c_1Y^2 \\
&= a_1X^2 + (b_1 + 2a_1n)XY + (a_1n^2 + b_1n + c_1)Y^2 \\
&= a_2X^2 + b_2XY + c_2Y^2
\end{aligned}$$

(say), with $|b_2| \le a_2$ and $a_2 = a_1$. But now $a_2$ may not be less than or equal to $c_2$. If
so, again apply $W$ and continue as before. After a finite number of steps we get an
almost reduced form (finite because $a \ge a_1 \ge a_2 \ge \cdots > 0$ are positive integers and
each application of $W$ makes the leading coefficient strictly smaller).
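
The proof is really an algorithm, and it can be sketched in a few lines of Python. The function name `almost_reduce` is ours, and the particular choice $n = \lfloor (a - b)/2a \rfloor$ is just one convenient integer satisfying $|b + 2an| \le a$; the proof allows any such $n$. A form is stored as the triple `(a, b, c)`.

```python
def almost_reduce(a, b, c):
    """Return an almost reduced form (|b| <= a <= c) equivalent to aX^2 + bXY + cY^2."""
    if a <= 0 or c <= 0 or b * b - 4 * a * c >= 0:
        raise ValueError("expected a form in S_1 of negative discriminant")
    while True:
        if a > c:
            a, b, c = c, -b, a          # apply W: swaps a and c, negates b
        if abs(b) <= a:
            return (a, b, c)            # now |b| <= a <= c
        n = (a - b) // (2 * a)          # gives -a < b + 2an <= a
        # apply T^n: (a, b, c) -> (a, b + 2an, an^2 + bn + c)
        b, c = b + 2 * a * n, a * n * n + b * n + c


print(almost_reduce(6, 11, 7))      # discriminant -47
print(almost_reduce(41, 81, 41))    # discriminant -163
```

Each pass through the loop either stops or strictly decreases the positive integer `a` at the next swap, which is the finiteness argument of the proof.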

COROLLARY. The number of equivalence classes in $S_1(\Delta)$ is finite.

PROOF. It suffices to show that the number of almost reduced forms is finite. If
$aX^2 + bXY + cY^2$ is almost reduced of discriminant $\Delta$ then

$$a \le c \;\Rightarrow\; 4a^2 \le 4ac = b^2 - \Delta \le a^2 - \Delta.$$

Hence $3a^2 \le |\Delta|$. Since $a$ is a positive integer, there are only finitely many possible
values of $a$ and hence of $b$. Once $a$ and $b$ are given, $c$ is uniquely determined.

A natural question to ask is: is there precisely one almost reduced form in each
equivalence class? The answer is: almost, but not quite.

We know that $X^2 + \frac{b}{a}X + \frac{c}{a}$ has two non-real roots (because $\Delta < 0$), say $-\tau$ and
$-\bar{\tau}$. One of $\tau$, $\bar{\tau}$ (say $\tau$) will lie in the upper half plane $\{x + iy : x, y \in \mathbb{R}, y > 0\}$.
Hence $F(X, Y) = aX^2 + bXY + cY^2 = a(X + \tau Y)(X + \bar{\tau}Y)$ with $b = a(\tau + \bar{\tau})$
and $c = a\tau\bar{\tau}$. Hence to say that $F$ is almost reduced is equivalent to saying that
$|\tau + \bar{\tau}| \le 1$ and $\tau\bar{\tau} \ge 1$, i.e., $\tau \in S$ where $S$ is the region shown in Figure 9.1
(including the boundary).


Notice that if $\tau$ is on the left vertical boundary of $S$ then $\tau + 1$ is on the right
vertical boundary which is also in $S$. Similarly, if $\tau$ is on the curve $\gamma$ then $-\frac{1}{\tau}$ is on
$\gamma'$. In view of this we make the following definition:

DEFINITION. We say that $F(X, Y) = aX^2 + bXY + cY^2$ is reduced if the corresponding
$\tau \in S$ but $\tau \notin$ the left boundary of $S$, i.e., the left vertical boundary and
curve $\gamma$. This is equivalent to saying that $|b| \le a \le c$, and in case $a = |b|$ then $b \ge 0$,
and in case $a = c$ then $b \ge 0$. We now have the expected theorem.

$\Delta$     possible $a$      possible $b$, $c$                   reduced forms            $h(\Delta)$

$-3$      $a = 1$           $b = 1$, $c = 1$                    $X^2 + XY + Y^2$         1
$-4$      $a = 1$           $b = 0$, $c = 1$                    $X^2 + Y^2$              1
$-7$      $a = 1$           $b = 1$, $c = 2$                    $X^2 + XY + 2Y^2$        1
$-8$      $a = 1$           $b = 0$, $c = 2$                    $X^2 + 2Y^2$             1
$-11$     $a = 1$           $b = 1$, $c = 3$                    $X^2 + XY + 3Y^2$        1
$-12$     $a = 1$ or $2$    $b = 0$, $c = 3$ if $a = 1$         $X^2 + 3Y^2$             2
                            $b = 2$, $c = 2$ if $a = 2$         $2(X^2 + XY + Y^2)$
$-15$     $a = 1$ or $2$    $b = 1$, $c = 4$ if $a = 1$         $X^2 + XY + 4Y^2$        2
                            $b = 1$, $c = 2$ if $a = 2$         $2X^2 + XY + 2Y^2$
$-16$     $a = 1$ or $2$    $b = 0$, $c = 4$ if $a = 1$         $X^2 + 4Y^2$             2
                            $b = 0$, $c = 2$ if $a = 2$         $2(X^2 + Y^2)$
$-19$     $a = 1$ or $2$    $b = 1$, $c = 5$ if $a = 1$         $X^2 + XY + 5Y^2$        1
$-20$     $a = 1$ or $2$    $b = 0$, $c = 5$ if $a = 1$         $X^2 + 5Y^2$             2
                            $b = 2$, $c = 3$ if $a = 2$         $2X^2 + 2XY + 3Y^2$
$-23$     $a = 1$ or $2$    $b = 1$, $c = 6$ if $a = 1$         $X^2 + XY + 6Y^2$        3
                            $b = 1$, $c = 3$ if $a = 2$         $2X^2 + XY + 3Y^2$
                            $b = -1$, $c = 3$ if $a = 2$        $2X^2 - XY + 3Y^2$

THEOREM. In each equivalence class of $S_1(\Delta)$ there is precisely one reduced form.
The chart lists the reduced forms for low values of $|\Delta|$; there $h(\Delta)$ is the
number of reduced forms.

We  notice  that  some  forms  in  the  list  are  constant  multiples  of  forms  which  came 
earlier  in  the  list. 

DEFINITION. A form $aX^2 + bXY + cY^2$ is said to be primitive if $\gcd(a, b, c) = 1$.

We let $h(\Delta)$ be the number of primitive reduced forms of discriminant $\Delta$ in $S_1(\Delta)$.
$h(\Delta)$ is known as the class number of the forms with discriminant $\Delta$. Notice that
$h(\Delta)$ is 1 for $\Delta = -3, -4, -7, -8, -11, -12, -16, -19$ in the list.
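
The finiteness argument ($3a^2 \le |\Delta|$, then finitely many $b$, then $c$ determined) turns directly into an enumeration. The Python sketch below is ours, with hypothetical function names. It lists the reduced forms of a negative discriminant and counts the primitive ones, which reproduces the chart and the class numbers quoted in the text.

```python
from math import gcd


def reduced_forms(D):
    """All reduced forms (a, b, c) with b^2 - 4ac = D, for D < 0, D = 0 or 1 mod 4."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -D:                     # from 4a^2 <= 4ac = b^2 - D <= a^2 - D
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue                       # c would not be an integer
            c = (b * b - D) // (4 * a)
            if c < a:
                continue                       # need a <= c
            if b < 0 and (a == -b or a == c):
                continue                       # boundary cases require b >= 0
            forms.append((a, b, c))
        a += 1
    return forms


def class_number(D):
    """Number of primitive reduced forms of discriminant D."""
    return sum(1 for (a, b, c) in reduced_forms(D) if gcd(gcd(a, b), c) == 1)


for D in (-3, -4, -7, -8, -11, -12, -15, -16, -19, -20, -23):
    print(D, reduced_forms(D), class_number(D))
```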

DEFINITION. An integer $\Delta \equiv 0$ or $1 \pmod 4$ is said to be a fundamental discriminant
if it is not of the form $\Delta_0 n^2$ where $\Delta_0$ is a discriminant and $n$ an integer greater than 1.

For instance, $-12$ and $-16$ are not fundamental discriminants. Notice that if $\Delta$ is
fundamental, then a form of discriminant $\Delta$ is always primitive. Notice also that if $\Delta$
is fundamental, then it cannot have an odd square factor. We will see later that if $\Delta$
is fundamental then it has another interpretation.

In 1934, Heilbronn showed that $h(\Delta) \to \infty$ as $\Delta \to -\infty$, from which it follows
(how?) that given any natural number $N$ there are only a finite number of negative
fundamental discriminants $\Delta$ for which the class number $h(\Delta) = N$. One of the
questions that suggests itself from the above is: what are the negative fundamental
$\Delta$ for which $h(\Delta)$ is 1? Above we have given six such $\Delta$’s. Here are three more:
$\Delta = -43, -67, -163$. In 1800, Gauss conjectured that there were no more.

In 1936, Siegel showed that for every $\varepsilon > 0$ there exists a positive constant $C_\varepsilon$
such that $h(\Delta) > C_\varepsilon|\Delta|^{\frac{1}{2} - \varepsilon}$. However, the result showed the existence of $C_\varepsilon$ but not
how to compute it. His proof showed that there cannot be two ‘large’ values of $|\Delta|$
for which $h(\Delta)$ is small. From this it was proved that there is possibly just one other
$\Delta$ (call it $\Delta_{10}$) for which $h(\Delta) = 1$ and this $\Delta$ must be very large indeed. In 1966,
Harold Stark, in his thesis, showed that $\Delta_{10}$ does not exist.¹ The same methods were
applied to the negative $\Delta$ for which $h(\Delta) = 2$ and it was found that there are 18 such
$\Delta$’s, the largest value of $|\Delta|$ being 427 (Baker, Stark, Montgomery etc). In 1986,
using powerful methods in algebraic geometry, D Goldfeld, B H Gross and D Zagier
solved the problem of fundamental negative $\Delta$ with $h(\Delta) = 3$.

REMARK. The $\Delta$ for which $h(\Delta) = 1$ have remarkable properties. For instance, if
$p$ is a positive prime number which is congruent to 3 (mod 4) and $h(-p) = 1$ then
$x^2 + x + \frac{p+1}{4}$ is a prime number for all $x$ such that $0 \le x < \frac{p-3}{4}$.
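
This can be checked by machine for the discriminants named above. The sketch below is ours and rests on an assumption about the formula in the remark: we take the polynomial to be $x^2 + x + \frac{p+1}{4}$ (whose discriminant is $-p$) and the range to be $0 \le x < \frac{p-3}{4}$. For $p = 163$ this is Euler's polynomial $x^2 + x + 41$ with $0 \le x \le 39$. The function names are hypothetical.

```python
def is_prime(n):
    """Primality by trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def all_values_prime(p):
    """Is x^2 + x + (p+1)/4 prime for every x with 0 <= x < (p-3)/4?"""
    k = (p + 1) // 4
    return all(is_prime(x * x + x + k) for x in range((p - 3) // 4))


for p in (7, 11, 19, 43, 67, 163):      # primes p = 3 mod 4 with h(-p) = 1
    print(p, all_values_prime(p))
print(23, all_values_prime(23))          # h(-23) = 3
```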

¹ In 1954, an amateur mathematician Heegner, in Germany, had proved the same result but his proof
had some gaps which were responsible for mathematicians expressing reservations about the proof. But
later it was shown by Stark that the arguments of Heegner can be made rigorous and he managed to
make Heegner’s proof work. In fact, Heegner’s ideas, in particular his construction of what are now called
Heegner points, have proved to be very fruitful in later work on elliptic curves.


The Class Number Problem

An Introduction to Algebraic Number Theory

Rajat Tandon

This  chapter  gives  an  introduction  to  ‘algebraic  number  theory’,  defines  class 
numbers  for  finite  extensions  of  the  field  of  rational  numbers  and  proves  that 
in  the  context  of  quadratic  fields,  this  definition  coincides  with  the  definition  of 
class  numbers  via  binary  quadratic  forms  given  in  the  previous  chapter. 

We have seen in the previous chapter that some seemingly innocuous questions starting
with the formula $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ lead to fairly deep mathematics. This is typical of
the subject. It is so important to ask the right question: “ask an impertinent question
and you get a pertinent answer”.

The roots of $aX^2 + bX + c = 0$ are given by $\frac{-b \pm \sqrt{\Delta}}{2a}$ where $\Delta = b^2 - 4ac$,
i.e., they are of the form $x + y\sqrt{\Delta}$ with $x$ and $y$ rational. The set $\mathbb{Q}(\sqrt{\Delta})$ of elements
of the form $x + y\sqrt{\Delta}$ with $x$ and $y$ rational, forms a subfield of the field of complex
numbers, $\mathbb{C}$. $\mathbb{Q}(\sqrt{\Delta})$ is also a vector space over the rationals if we define scalar
multiplication by $\lambda(x + y\sqrt{\Delta}) = \lambda x + \lambda y\sqrt{\Delta}$. $\{1, \sqrt{\Delta}\}$ is a basis of $\mathbb{Q}(\sqrt{\Delta})$ over $\mathbb{Q}$, and
$\mathbb{Q}$ is a subfield of $\mathbb{Q}(\sqrt{\Delta})$. This process can easily be generalised. For instance, let
$p$ be a prime and $\zeta = e^{2\pi i/p}$. Let $\mathbb{Q}(\zeta)$ be the set of complex numbers of the form
$x_0 + x_1\zeta + x_2\zeta^2 + \cdots + x_{p-2}\zeta^{p-2}$ with $x_i$ rational. Note that $1 + \zeta + \zeta^2 + \zeta^3 + \cdots
+ \zeta^{p-1} = 0$ so $\zeta^{p-1}$ can be written in terms of $1, \zeta, \zeta^2, \zeta^3, \ldots, \zeta^{p-2}$. Check that $\mathbb{Q}(\zeta)$
is a subfield of $\mathbb{C}$ containing $\mathbb{Q}$ and that $1, \zeta, \zeta^2, \ldots, \zeta^{p-2}$ is a basis of $\mathbb{Q}(\zeta)$ over
$\mathbb{Q}$ with scalar multiplication being defined in the obvious way. These are examples
of fields containing $\mathbb{Q}$ which are finite dimensional as vector spaces over $\mathbb{Q}$. Such
fields are known as algebraic number fields and were the object of detailed study by
Dedekind, Kronecker and Kummer in the 19th century. Amongst the several motivations
for studying such fields were three problems suggested by Greek geometers:

(i)  To  trisect  any  given  angle. 

(ii)  To  construct  a  cube  whose  volume  is  twice  that  of  a  given  cube. 

(iii) To construct a square equal in area to a given circle.

These constructions were to be done by ‘ruler and compass only’ in the manner
that we are taught at school. The second problem boils down to being able to construct
by ruler and compass the real root of $X^3 - 2$. Galois and Abel looked at such
problems and their work gave a huge impetus to the systematisation of algebra and
algebraic number theory.

The examples given above, $\mathbb{Q}(\sqrt{\Delta})$ and $\mathbb{Q}(\zeta)$, have been generated by single elements
($\sqrt{\Delta}$ and $\zeta$) which satisfy some polynomial with rational (in fact, integral)
coefficients ($X^2 - \Delta$, $X^p - 1$ respectively). Indeed, it can be shown that any subfield
of $\mathbb{C}$ containing $\mathbb{Q}$ which is $n$-dimensional as a vector space over $\mathbb{Q}$ consists of elements
of the form $x_0 + x_1\alpha + x_2\alpha^2 + \cdots + x_{n-1}\alpha^{n-1}$ where the $x_i$ are rationals and $\alpha$
is a complex number which satisfies a polynomial equation of degree $n$ with rational
coefficients.

The first thing we would want to know about such fields is whether they have a
subring in them in much the same way that $\mathbb{Q}$ contains $\mathbb{Z}$ and every element of $\mathbb{Q}$
is a ratio of two (one non-zero) elements of $\mathbb{Z}$. One ‘natural’ possibility in $\mathbb{Q}(\sqrt{\Delta})$
could be $\mathbb{Z} + \mathbb{Z}\sqrt{\Delta}$, i.e., elements of the form $a + b\sqrt{\Delta}$ with $a$ and $b$ integers or in
other words $\mathbb{Z}$-linear combinations of the basis $1, \sqrt{\Delta}$. Similarly one could consider
$\mathbb{Z} + \mathbb{Z}\zeta + \mathbb{Z}\zeta^2 + \mathbb{Z}\zeta^3 + \cdots + \mathbb{Z}\zeta^{p-2}$ in $\mathbb{Q}(\zeta)$. But immediately one would recognise
a difficulty in basing a definition on the choice of a basis. For instance,
$\mathbb{Q}(\sqrt{\Delta}) = \mathbb{Q}(\sqrt{4\Delta})$ but $\mathbb{Z} + \mathbb{Z}\sqrt{\Delta} \ne \mathbb{Z} + \mathbb{Z}\sqrt{4\Delta}$, or observe that if $p = 3$ then
$\zeta = e^{2\pi i/3} = \frac{-1 + \sqrt{-3}}{2}$, so $\mathbb{Q}(\zeta) = \mathbb{Q}(\sqrt{-3})$ but $\mathbb{Z} + \mathbb{Z}\zeta \ne \mathbb{Z} + \mathbb{Z}\sqrt{-3}$. To get around
the problem of square factors of $\Delta$, we will henceforth assume that $\Delta$ is a fundamental
discriminant (see the previous chapter). Hence the only square factor $\Delta$ can have is 4.

We have already seen that the fields above are generated by elements which satisfy
a monic (leading coefficient 1) polynomial with rational coefficients. In fact, every
element $a + b\sqrt{\Delta}$ in $\mathbb{Q}(\sqrt{\Delta})$ satisfies the polynomial $X^2 - 2aX + (a^2 - b^2\Delta) = 0$.
This suggests an alternative. Why not consider those elements of $\mathbb{Q}(\sqrt{\Delta})$ (or $\mathbb{Q}(\zeta)$)
which satisfy a monic polynomial with coefficients in $\mathbb{Z}$? Such elements are called
algebraic integers (in the given field). Do such elements form a subring $I$, i.e., are
they closed under addition and multiplication? The answer is ‘yes’. Observe that
$a + b\sqrt{\Delta}$ will be an element of the given type provided $2a \in \mathbb{Z}$ and $a^2 - b^2\Delta \in \mathbb{Z}$.
Suppose then that $a + b\sqrt{\Delta}$ and $c + d\sqrt{\Delta}$ are such that $2a, 2c \in \mathbb{Z}$ and $a^2 - b^2\Delta$, $c^2 -
d^2\Delta \in \mathbb{Z}$. Observe that $2(a + c) \in \mathbb{Z}$ and $(a + c)^2 - (b + d)^2\Delta = (a^2 - b^2\Delta) + (c^2 -
d^2\Delta) + 2ac - 2bd\Delta$. We say that a rational number is a half integer if it is of the form
$l/2$, where $l$ is odd. We make the following observations which can easily be proved
by the reader: for $a, b \in \mathbb{Q}$, $2a$ and $a^2 - b^2\Delta$ are integers implies

(i) $2b \in \mathbb{Z}$ since $\Delta$ has no square factor other than possibly 4;

(ii) if $\Delta$ is even, then $a$ must be an integer and $b$ either an integer or half integer;

(iii) if $\Delta$ is odd, $a$ and $b$ must be either both integers or both half integers.

In all cases it can then be seen that if $2a, 2c \in \mathbb{Z}$ and $a^2 - b^2\Delta$, $c^2 - d^2\Delta \in \mathbb{Z}$ then
$2ac - 2bd\Delta \in \mathbb{Z}$ and therefore that $(a + c)^2 - (b + d)^2\Delta \in \mathbb{Z}$. On the other hand,

$$(a + b\sqrt{\Delta}) \cdot (c + d\sqrt{\Delta}) = ac + bd\Delta + (ad + bc)\sqrt{\Delta},$$

and both $2(ac + bd\Delta)$ and

$$(ac + bd\Delta)^2 - (ad + bc)^2\Delta = (a^2 - b^2\Delta) \cdot (c^2 - d^2\Delta)$$

are in $\mathbb{Z}$. Hence $I$ is indeed closed under addition and multiplication.
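
The criterion "$2a \in \mathbb{Z}$ and $a^2 - b^2\Delta \in \mathbb{Z}$" lends itself to a small experiment with exact rational arithmetic. The Python sketch below is ours, with hypothetical function names. It represents $a + b\sqrt{\Delta}$ as the pair `(a, b)` of fractions and checks closure under addition and multiplication on a small grid of elements with half-integer coordinates. It illustrates the claim for a few fundamental discriminants; it is of course not a proof.

```python
from fractions import Fraction


def is_integer_element(a, b, D):
    """Criterion from the text: 2a and a^2 - b^2*D are both ordinary integers."""
    a, b = Fraction(a), Fraction(b)
    return (2 * a).denominator == 1 and (a * a - b * b * D).denominator == 1


def add(u, v):
    """(a + b*sqrt(D)) + (c + d*sqrt(D)) as a pair of coordinates."""
    return (u[0] + v[0], u[1] + v[1])


def mul(u, v, D):
    """(a + b*sqrt(D)) * (c + d*sqrt(D)) = (ac + bdD) + (ad + bc)*sqrt(D)."""
    (a, b), (c, d) = u, v
    return (a * c + b * d * D, a * d + b * c)


def closure_holds(D, bound=4):
    """Check closure of the integer elements with coordinates i/2, |i| <= bound."""
    halves = [Fraction(i, 2) for i in range(-bound, bound + 1)]
    elems = [(a, b) for a in halves for b in halves if is_integer_element(a, b, D)]
    return all(
        is_integer_element(*add(u, v), D) and is_integer_element(*mul(u, v, D), D)
        for u in elems for v in elems
    )


print(is_integer_element(Fraction(1, 2), Fraction(1, 2), -3))   # (1 + sqrt(-3))/2
print(is_integer_element(Fraction(1, 2), Fraction(1, 2), -4))   # not an integer
print([closure_holds(D) for D in (-3, -4, -7, -8, 5)])
```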
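
Exercise. Show that

(i) in $\mathbb{Q}(\sqrt{-1})$ we have $I = \{a + b\sqrt{-1} \mid a, b \in \mathbb{Z}\}$;

(ii) in $\mathbb{Q}(\sqrt{-3})$, $I = \left\{\frac{a + b\sqrt{-3}}{2} \mid a, b \in \mathbb{Z},\ a \equiv b \pmod 2\right\} = \mathbb{Z} + \mathbb{Z}\zeta$, where
$\zeta = \frac{-1 + \sqrt{-3}}{2}$ is a cube root of unity.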

Would every element of $\mathbb{Q}(\sqrt{\Delta})$ be a ratio of two elements of $I$? We note that $\mathbb{Z} +
\mathbb{Z}\sqrt{\Delta} \subset I$ and $\frac{a}{b} + \frac{c}{d}\sqrt{\Delta} = \frac{ad + bc\sqrt{\Delta}}{bd}$, so this is trivially true. What other properties
of $\mathbb{Z}$ would we like $I$ to have? The best would be unique factorisation. In $\mathbb{Z}$ we have
the notion of a prime number and we know that every number can be written up to
sign uniquely as a product of distinct prime powers, viz,

$$n = \pm p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$$

where the $p_i$ are distinct primes and, moreover, if $n$ is also equal to $\pm q_1^{f_1} q_2^{f_2} \cdots q_s^{f_s}$,
then after changing the order of the $q_i$’s, if necessary, we have $r = s$, $p_i = q_i$ and
$e_i = f_i$ for all $i$.

Imagine the usefulness of having such a property in $I$. For instance, consider $\mathbb{Q}(\zeta)$
as above and the ring of integers $I$ in $\mathbb{Q}(\zeta)$, i.e., the set of all elements in $\mathbb{Q}(\zeta)$ which
satisfy a monic polynomial in $\mathbb{Z}[X]$, the ring of polynomials in one variable with
integer coefficients. Suppose there exist non-zero integers $x$, $y$, $z$ such that $x^p + y^p =
z^p$. Then,

$$x^p = z^p - y^p = (z - y)(z - \zeta y)(z - \zeta^2 y) \cdots (z - \zeta^{p-1} y). \qquad (1)$$

It is easy to see that $x \in I$ and $z - \zeta y \in I$. If we have unique factorisation in $I$, there
is just a chance that (1) may give us a contradiction to unique factorisation (or allow
us to use the method of descent) and we may prove Fermat’s¹ last theorem! It is just
possible that Fermat had some such proof in mind when he wrote in the margin ....

We  would  first  need  the  notion  of  a  prime  element  in  /.  This  is  accomplished  more 
or  less  as  in  Z — negatives  allowed.  So  we  consider  -2,  -3,  -5, . . .  also  as  primes. 

DEFINITION  1.  An  integer  n  is  a  prime  if  whenever  n  is  written  as  a  product  ab  of 
two  integers  then  either  a  or  b  must  be  ±1.  Note  that  ±1  are  the  only  units  in  Z,  i.e., 
elements  in  Z  with  a  multiplicative  inverse. 

There  is  another  way  of  defining  a  prime  number. 

DEFINITION 1'. An integer $p \neq \pm 1$ is a prime if and only if whenever $p$ divides a
product of integers $ab$ then $p$ must divide either $a$ or $b$.


1 Incidentally, this is what Gauss had to say about FLT. “I confess that Fermat’s theorem as an isolated
proposition has very little interest for me because I could easily lay down a multitude of such propositions
which one could neither prove nor dispose of.” Gauss said that FLT had induced him to recall some of his
earlier ideas in higher arithmetic but that he was not in a position to go back to that work because of his
circumstances. “Still I am convinced that if I am as lucky as I dare hope and if I succeed in taking some
of the principal steps in that theory, then Fermat’s theorem will appear as only one of the least interesting
corollaries.”
Recall that if $n$ is an integer then $n\mathbb{Z}$, the set of multiples of $n$, forms an ideal in
$\mathbb{Z}$ (an ideal $J$ in a commutative ring $R$ is an additive subgroup of $R$ which has the
property: $x \in J$, $r \in R$ implies $rx \in J$). If an ideal $I$ in a ring satisfies the property:
$ab \in I$ implies either $a \in I$ or $b \in I$, it is called a prime ideal. So saying that the
integer $p$ is a prime number is the same as saying that $p\mathbb{Z}$ is a prime ideal in $\mathbb{Z}$. It is
easy to see that the two definitions we have given are equivalent in $\mathbb{Z}$.

Based on the above, we could define in an arbitrary commutative ring with unity $R$
(all our rings will be so) an element $\pi$ to be prime either by requiring that whenever
$\pi = ab$, either $a$ or $b$ must be a unit in $R$, or by requiring that the ideal $\pi R$, consisting
of all multiples of $\pi$, is a prime ideal. Unfortunately, in an arbitrary ring the two
definitions are not equivalent. An element $\pi$ which satisfies the first property is said
to be irreducible whereas if $\pi R$ is a prime ideal we call $\pi$ a prime. In integral domains
(commutative rings with no zero divisors) all primes are irreducible but not vice versa.
(Exercise: Prove this.)

A domain in which every non-zero non-unit can be written as a product of irreducibles
in an essentially unique way, that is, up to order and multiplication by units
($6 = 2\cdot 3 = 3\cdot 2 = (-2)\cdot(-3) = (-3)\cdot(-2)$), is called a unique factorisation domain
(UFD). Clearly, $\mathbb{Z}$ is a UFD and it is easy to check that $\mathbb{Z} + \mathbb{Z}i$ is also a UFD.

$\mathbb{Z}$ has another property which is somewhat stronger: every ideal in $\mathbb{Z}$ is of the
form $n\mathbb{Z}$ where $n$ is an integer. A domain $D$ which has the property that every ideal
in it is of the form $xD$ for some $x$ in $D$ is called a principal ideal domain (PID)
and every PID is a UFD. If we could show that the ring of integers $I$ in an algebraic
number field is always a PID then we could use the argument given above for FLT.
Unfortunately, $I$ is not always a PID. For instance, consider $\mathbb{Q}(\sqrt{-20})$; then
$I = \mathbb{Z} + \mathbb{Z}\sqrt{-5}$ and we have $6 = 2\cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$. It is easy to check that 2,
3, $1 \pm \sqrt{-5}$ are all irreducible elements in $I$. We remark that the ring of integers of
an algebraic number field is a UFD if and only if it is a PID.
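The "easy to check" step can be done with norms. The sketch below assumes the standard norm $N(a + b\sqrt{-5}) = a^2 + 5b^2$, which is multiplicative and equals 1 exactly on the units $\pm 1$; the function names are illustrative only. A proper factor of 2, 3 or $1 \pm \sqrt{-5}$ would need norm 2 or 3, and the search shows that no such element exists.

```python
def norm(a: int, b: int) -> int:
    """Norm of a + b*sqrt(-5), namely a^2 + 5*b^2."""
    return a * a + 5 * b * b


def elements_of_norm(n: int) -> list[tuple[int, int]]:
    """All pairs (a, b) with a^2 + 5*b^2 == n, found by a bounded search."""
    bound = int(n ** 0.5) + 1
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if norm(a, b) == n]


# Norms of the four factors appearing in 6 = 2*3 = (1+sqrt(-5))(1-sqrt(-5)).
factor_norms = {
    "2": norm(2, 0),
    "3": norm(3, 0),
    "1+sqrt(-5)": norm(1, 1),
    "1-sqrt(-5)": norm(1, -1),
}

# A proper factor of any of them would have norm 2 or 3; none exists.
no_norm_2 = elements_of_norm(2) == []
no_norm_3 = elements_of_norm(3) == []
```

Since the norms 4 and 9 differ from 6, neither 2 nor 3 is a unit multiple of $1 \pm \sqrt{-5}$, so the two factorisations of 6 really are different.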

Recall that if $\mathcal{A}$ and $\mathcal{B}$ are two ideals in a ring $R$ then we define their product as
$\mathcal{A}\cdot\mathcal{B} = \{\sum_{i=1}^{n} a_i b_i \mid a_i \in \mathcal{A},\ b_i \in \mathcal{B}, \text{ for some } n\}$. This is also an ideal. Though $I$
is not always a PID it is true that every ideal in $I$ can be written uniquely, except for
order, as a product of prime ideals. This gives us the first hint that the concept of an
ideal may be at least as important as the notion of an element. Note that in a PID
the two notions are almost the same, as every ideal is generated by a single element
which is uniquely determined up to units.

So if $I$ is not always a PID then how ‘bad’ is it? The set $\mathcal{I}$ of ideals in $I$ under
the product defined above forms a semigroup ($I$ itself is the identity). We define
an equivalence relation on this set $\mathcal{I}$ as follows: $\mathcal{A} \sim \mathcal{B}$ if there exist non-zero $\alpha, \beta \in I$
such that $\alpha I\cdot\mathcal{A} = \beta I\cdot\mathcal{B}$. It is easy to check that this gives us an equivalence
relation on $\mathcal{I}$ and the product on $\mathcal{I}$ induces a product on the set of equivalence classes
$\mathcal{I}/\!\sim$: $[\mathcal{A}]\cdot[\mathcal{B}] = [\mathcal{A}\cdot\mathcal{B}]$. The crucial point here is to check that ‘$\cdot$’ as defined
above is well defined, i.e., if $\mathcal{A} \sim \mathcal{A}'$ and $\mathcal{B} \sim \mathcal{B}'$ then $\mathcal{A}\cdot\mathcal{B} \sim \mathcal{A}'\cdot\mathcal{B}'$. The set
of equivalence classes $\mathcal{I}/\!\sim$ with this product is actually a group. It is one of the
fundamental theorems of algebraic number theory that this group is finite, not just
for quadratic extensions of $\mathbb{Q}$ but for any finite extension of $\mathbb{Q}$. The order of this
group is called the class number of the extension. The class number of $\mathbb{Q}(\sqrt{\Delta})$ will
be denoted by $h'(\Delta)$. Note that the class number is one if and only if $I$ is a PID.
Now let $\Delta$ be a negative fundamental discriminant, i.e., a negative integer $\Delta$ which
is congruent to 0 or 1 modulo 4 and which cannot be written in the form $\Delta_0 n^2$ where
$\Delta_0$ is another discriminant and $n$ is an integer greater than 1. Hence 4 is the only
possible square factor of $\Delta$. Recall that we have defined $h(\Delta)$ to be the number of
equivalence classes of primitive binary integral quadratic forms. Remarkably,

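THEOREM. $h(\Delta) = h'(\Delta)$.

In order to prove this we first observe that if $\alpha = a + b\sqrt{\Delta}$
is in the ring of integers $I$ of $\mathbb{Q}(\sqrt{\Delta})$ then so also is $\bar\alpha = a - b\sqrt{\Delta}$. Hence so also
is $\alpha\bar\alpha$, which is an integer. Hence if $\mathcal{A}$ is any non-zero ideal of $I$ then $\mathcal{A} \cap \mathbb{Z} \neq (0)$.
Clearly $\mathcal{A} \cap \mathbb{Z}$ is an ideal in $\mathbb{Z}$ so $\mathcal{A} \cap \mathbb{Z} = a\mathbb{Z}$ for some integer $a > 0$. Observe also
that any non-zero ideal of $I$ cannot be contained in $\mathbb{Z}$.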

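In order to make life a bit easier, we will assume in what follows that $\Delta$ is odd and
hence that $I = \mathbb{Z} + \mathbb{Z}[(1 + \sqrt{\Delta})/2]$ (proof?). Let $\mathcal{A}$ be an ideal in $I$. Define

$$J = \left\{ r \in \mathbb{Z} \;\middle|\; r\,\frac{1 + \sqrt{\Delta}}{2} + s \in \mathcal{A} \text{ for some } s \in \mathbb{Z} \right\}.$$

Then $J$ is an ideal in $\mathbb{Z}$ and since $\mathcal{A} \not\subset \mathbb{Z}$, $J$ is non-zero. Let $J = t\mathbb{Z}$,
$t > 0$. Then there exists an $s \in \mathbb{Z}$ such that $t[(1 + \sqrt{\Delta})/2] + s \in \mathcal{A}$. We claim that
$\mathcal{A} = a\mathbb{Z} + [(t + 2s + t\sqrt{\Delta})/2]\mathbb{Z}$. Clearly, the right-hand side is contained in $\mathcal{A}$. Let
$\alpha = u + v[(1 + \sqrt{\Delta})/2] \in \mathcal{A}$. Then $v \in J$ so $v = tv'$ for some $v' \in \mathbb{Z}$. Therefore

$$\alpha - v'\,\frac{(t + 2s) + t\sqrt{\Delta}}{2} = u + tv'\,\frac{1 + \sqrt{\Delta}}{2} - v'\,\frac{(t + 2s) + t\sqrt{\Delta}}{2} = u - sv' \in \mathcal{A} \cap \mathbb{Z} = a\mathbb{Z}.$$

Therefore, $\alpha \in a\mathbb{Z} + [(t + 2s + t\sqrt{\Delta})/2]\mathbb{Z}$. Hence, every ideal $\mathcal{A}$ in $I$ is of the form
$a\mathbb{Z} + [(b + c\sqrt{\Delta})/2]\mathbb{Z}$, $a > 0$, $c > 0$. For this to be an ideal, it must be closed under
multiplication by $(1 + \sqrt{\Delta})/2$. Hence $a[(1 + \sqrt{\Delta})/2] \in a\mathbb{Z} + [(b + c\sqrt{\Delta})/2]\mathbb{Z}$, i.e.,
there exist integers $m, n$ such that $a[(1 + \sqrt{\Delta})/2] = ma + n[(b + c\sqrt{\Delta})/2]$, which gives $a = nc$
and $1 = 2m + \frac{b}{c}$, i.e., $c$ divides $a$, $c$ divides $b$ and $\frac{b}{c}$ is odd. Let $a = tc$, $b = uc$, $u$ odd.

Then $a\mathbb{Z} + [(b + c\sqrt{\Delta})/2]\mathbb{Z} = tc\mathbb{Z} + [(uc + c\sqrt{\Delta})/2]\mathbb{Z} = c[t\mathbb{Z} + [(u + \sqrt{\Delta})/2]\mathbb{Z}]$.
Hence, every ideal $\mathcal{A}$ in $I$ is of the form $c[t\mathbb{Z} + [(u + \sqrt{\Delta})/2]\mathbb{Z}]$, with $c > 0$, $t > 0$ and $u$
odd. Further, since $\mathcal{A}$ is closed under multiplication by $(1 + \sqrt{\Delta})/2$,
$c[(u + \sqrt{\Delta})/2][(1 + \sqrt{\Delta})/2] \in \mathcal{A}$. Hence there exist integers $h, k$ such that

$$\frac{(u + \Delta) + (1 + u)\sqrt{\Delta}}{4} = ht + k\,\frac{u + \sqrt{\Delta}}{2}.$$

Therefore, $k = \frac{1 + u}{2}$ and $\frac{u + \Delta}{4} = ht + \frac{u(1 + u)}{4}$.
Hence $\Delta = u^2 + 4ht$. We have proved:


PROPOSITION. Every ideal in $I$ is of the form $t[a\mathbb{Z} + \{(b + \sqrt{\Delta})/2\}\mathbb{Z}]$ for some
integers $a, b, t$ with $t > 0$, $a > 0$ and such that there exists an integer $c$ with
$\Delta = b^2 - 4ac$.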


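PROOF of the Theorem. We denote by $[aX^2 + bXY + cY^2]$ the equivalence class
of the form $aX^2 + bXY + cY^2$ in $S_1(\Delta)$. We denote by $[\mathcal{A}]$ the equivalence class of
the ideal $\mathcal{A}$ in $I$. Define

$$e : S_1(\Delta)/\!\sim\ \longrightarrow\ \mathcal{I}/\!\sim, \qquad [aX^2 + bXY + cY^2] \longmapsto \left[a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z}\right].$$

Then the proposition we have proved above shows that $e$ is surjective. We need, of
course, to show that $e$ is well defined. For this we must show that if

$$A\cdot(aX^2 + bXY + cY^2) = a'X^2 + b'XY + c'Y^2$$

where $A$ is either $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ or $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, then

$$a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z} \ \sim\ a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z}.$$

If $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ then $a' = a$ and $b' = b + 2a$, which implies that
$a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\mathbb{Z} = a'\mathbb{Z} + [(b' + \sqrt{\Delta})/2]\mathbb{Z}$.

If $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ then $a' = c$ and $b' = -b$ so

$$a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z} = c\mathbb{Z} + \frac{-b + \sqrt{\Delta}}{2}\,\mathbb{Z} = \frac{b^2 - \Delta}{4a}\,\mathbb{Z} + \frac{-b + \sqrt{\Delta}}{2}\,\mathbb{Z}.$$

Therefore,

$$a\left(a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z}\right) = \frac{-b + \sqrt{\Delta}}{2}\left(a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z}\right)$$

and we have proved what was required.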

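In order to prove our theorem we must show that $e$ is a bijection. Only the injectivity
of $e$ is left. Before proving injectivity we make two remarks:

(a) If $\mathcal{A}$ and $\mathcal{B}$ are two ideals in $I$ then they are equivalent if there exist $\alpha, \beta \in I$
such that $\alpha\cdot\mathcal{A} = \beta\cdot\mathcal{B}$. But this is equivalent to $\alpha\bar\alpha\mathcal{A} = \bar\alpha\beta\mathcal{B}$ and $\alpha\bar\alpha$ is a positive
integer. Hence $\mathcal{A} \sim \mathcal{B}$ if and only if there exist an integer $t > 0$ and $\beta \in I$
such that $t\cdot\mathcal{A} = \beta\cdot\mathcal{B}$.

(b) If $\alpha, \beta \in I$ and $\alpha\mathbb{Z} + \beta\mathbb{Z} = \gamma\mathbb{Z} + \delta\mathbb{Z}$ then there exists an integral $2 \times 2$ matrix
$A$ of determinant $\pm 1$ such that $A\cdot\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} \gamma \\ \delta \end{pmatrix}$.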

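Now suppose that

$$e([aX^2 + bXY + cY^2]) = e([a'X^2 + b'XY + c'Y^2]),$$

i.e.,

$$a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z} \ \sim\ a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z}.$$

Hence there exist an integer $t' > 0$ and $\alpha = \frac{p + q\sqrt{\Delta}}{2}$ in $I$ such that

$$\alpha\cdot\left(a\mathbb{Z} + \frac{b + \sqrt{\Delta}}{2}\,\mathbb{Z}\right) = t'\cdot\left(a'\mathbb{Z} + \frac{b' + \sqrt{\Delta}}{2}\,\mathbb{Z}\right) = \mathcal{A} \text{ (say)}.$$

We must show that

$$a'X^2 + b'XY + c'Y^2 = A\cdot(aX^2 + bXY + cY^2)$$

for some $A$ in $SL(2, \mathbb{Z})$.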
CASE 1: Let $q = 0$ and $t = p/2$. Then $at\mathbb{Z} = a't'\mathbb{Z} = \mathcal{A} \cap \mathbb{Z}$. We may without loss of
generality assume that $at = a't'$ and hence $t > 0$. There exist integers $m, n$ such that
$t[(b + \sqrt{\Delta})/2] = ma't' + nt'[(b' + \sqrt{\Delta})/2]$, which implies that $t = nt'$ and hence $a' = na$.
There also exist integers $k, l$ such that $t'[(b' + \sqrt{\Delta})/2] = kta + lt[(b + \sqrt{\Delta})/2]$.
Hence $ln = 1$, so $n = 1$, $t = t'$, $a = a'$ and $b' = b + 2ak$. It is now easy to see that

$$a'X^2 + b'XY + c'Y^2 = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}\cdot(aX^2 + bXY + cY^2).$$


CASE 2: ($q \neq 0$). In view of case 1 we may assume that $(p, q) = 1$. By the proposition
above and remark (b) there exists an integral matrix $A = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$ of
determinant $\pm 1$ such that

$$A\begin{pmatrix} a\,\dfrac{p + q\sqrt{\Delta}}{2} \\[2mm] \dfrac{b + \sqrt{\Delta}}{2}\cdot\dfrac{p + q\sqrt{\Delta}}{2} \end{pmatrix} = \begin{pmatrix} t'a' \\[2mm] t'\,\dfrac{b' + \sqrt{\Delta}}{2} \end{pmatrix}$$

or, in fact, by multiplying by the matrix $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, if necessary, we can assume that
$A$ is in $SL(2, \mathbb{Z})$ and

$$A\begin{pmatrix} a\,\dfrac{p + q\sqrt{\Delta}}{2} \\[2mm] \dfrac{b + \sqrt{\Delta}}{2}\cdot\dfrac{p + q\sqrt{\Delta}}{2} \end{pmatrix} = \begin{pmatrix} \pm t'a' \\[2mm] t'\,\dfrac{b' + \sqrt{\Delta}}{2} \end{pmatrix}. \qquad (2)$$

Therefore, $xa[(p + q\sqrt{\Delta})/2] + y[\{(bp + q\Delta) + (p + bq)\sqrt{\Delta}\}/4] = \pm t'a'$, which
implies that $xa(p/2) + y[(bp + q\Delta)/4] = \pm t'a'$ and $xa(q/2) + y[(p + bq)/4] = 0$.
Hence, $2xaq = -y(p + bq)$. Let $e$ be the positive g.c.d. of $2a$ and $p + bq$; since $(p, q) = 1$,
it is also the g.c.d. of $2aq$ and $p + bq$. Then
$x[(2aq)/e] = -y[(p + bq)/e]$, so $\frac{2aq}{e}$ divides $y$ and $y = \frac{2aq}{e}\cdot r$ for some integer $r$.
Then $x = -r[(p + bq)/e]$. Since $(x, y) = 1$ we get $r = \pm 1$. A simple calculation now
shows that $xa(p/2) + y[(bp + q\Delta)/4] = \pm t'a' = -\frac{2ra}{e}\,\alpha\bar\alpha$. Hence, keeping in view the
various signs, we get $t'a' = (2a/e)\alpha\bar\alpha$. Furthermore, since $xw - yz = 1$, substituting
the values of $x$ and $y$ given above, we get $w(p + bq) + 2aqz = -re$. We further
get from (2) that $za[(p + q\sqrt{\Delta})/2] + w[\{(b + \sqrt{\Delta})/2\}\{(p + q\sqrt{\Delta})/2\}] = t'[(b' + \sqrt{\Delta})/2]$,
which implies that $2zap + w(bp + q\Delta) = 2t'b'$ and $2zaq + w(p + bq) = 2t'$,
i.e., $-re = 2t'$. Hence $2zap + w(bp + q\Delta) = -reb'$. It is now easy to check that
$A^{t}\cdot(aX^2 + bXY + cY^2) = a'X^2 + b'XY + c'Y^2$. For instance, the coefficient of $X^2$,
if we replace $X$ by $xX + zY$ and $Y$ by $yX + wY$ in the expression $aX^2 + bXY + cY^2$,
is $ax^2 + bxy + cy^2$. Substituting $x = -r[(p + bq)/e]$ and $y = \frac{2aq}{e}\cdot r$ and using the
fact that $t'a' = (2a/e)\alpha\bar\alpha$ and $2t' = -re$, we get $ax^2 + bxy + cy^2 = a'$. Similarly, the
coefficient of $XY$ on the required transformation is $2axz + bxw + byz + 2cyw$ which
on substitution is just $b'$. Therefore $A^{t}\cdot(aX^2 + bXY + cY^2) = a'X^2 + b'XY + c'Y^2$
and $e$ is injective.

This  is  a  beautiful  example  in  mathematics  where  two  apparently  unrelated  objects 
turn  out  to  be  equal.  Maybe  the  reader  can  discover  some  more. 
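The equality $h(\Delta) = h'(\Delta)$ makes the class number computable by counting forms. The sketch below assumes the classical reduction theory for negative discriminants, which is not proved in this section: every class of primitive positive definite forms contains exactly one reduced form $(a, b, c)$ with $-a < b \le a \le c$, and $b \ge 0$ when $a = c$. The function names are illustrative.

```python
from math import gcd


def reduced_forms(disc: int) -> list[tuple[int, int, int]]:
    """Reduced primitive forms (a, b, c) with b^2 - 4ac == disc, for disc < 0."""
    if disc >= 0 or disc % 4 not in (0, 1):
        raise ValueError("expected a negative discriminant congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -disc:               # a reduced form has 3a^2 <= |disc|
        for b in range(-a + 1, a + 1):      # -a < b <= a
            numerator = b * b - disc        # must equal 4ac
            if numerator % (4 * a) != 0:
                continue
            c = numerator // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:            # on the boundary a == c keep only b >= 0
                continue
            if gcd(gcd(a, b), c) == 1:      # primitive forms only
                forms.append((a, b, c))
        a += 1
    return forms


def class_number(disc: int) -> int:
    """Number of classes of primitive forms of negative discriminant disc."""
    return len(reduced_forms(disc))
```

For $\Delta = -20$ this finds the two forms $X^2 + 5Y^2$ and $2X^2 + 2XY + 3Y^2$, in agreement with the earlier observation that $\mathbb{Z} + \mathbb{Z}\sqrt{-5}$ is not a PID.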
Roots are Not Contained in Cyclotomic Fields


Rajat Tandon

The square root of any integer is contained in a cyclotomic field, i.e., an extension
field $\mathbb{Q}(\zeta_n)$ of $\mathbb{Q}$ generated by $\zeta_n = e^{2\pi i/n}$. There is a famous theorem of Kronecker
and Weber (see the remarks at the end) which vastly generalises this fact. In what
follows, if $\alpha_1, \alpha_2, \ldots, \alpha_n$ are complex numbers, we denote by $\mathbb{Q}(\alpha_1, \alpha_2, \ldots, \alpha_n)$ the
smallest subfield of $\mathbb{C}$ containing the $\alpha$'s. As in the case of Fermat’s last theorem
(FLT, where $x^n + y^n = z^n$ has integer solutions only in the case $n = 2$), the surprising
fact is that other $n$th roots (other than square roots) are never contained in a cyclotomic
extension. Of course, one must exercise a little care. For instance $\sqrt[4]{4} = \sqrt{2}$ is
a square root and hence contained in a cyclotomic extension. The point here is that
$\sqrt[4]{4}$ is not a genuine fourth root; it is, in fact, a square root.

DEFINITION 1. If $a$ is an integer greater than 1 then the real number $\sqrt[n]{a}$ is said to be
a genuine $n$th root if it cannot be written in the form $\sqrt[m]{b}$ for some integer $b$ and some
$m < n$.

In particular, a genuine $n$th root for $n > 1$ is irrational; for if it is rational, then it is
of the form $\sqrt[m]{b}$ for some integer $b$ with $m = 1$. We have the following theorem:

THEOREM 2. Let $a$ be any integer. Then, $\sqrt{a}$ is contained in a cyclotomic field. If
$\sqrt[n]{a}$ is a genuine $n$th root where $a$ is an integer greater than 1 and $n$ an integer greater
than 2, then $\sqrt[n]{a}$ is not contained in any cyclotomic field.

The  first  assertion  is  very  well-known  and  is  easy  to  establish.  While  proving  it,  one 
actually  proves  a  stronger  statement  viz., 

Proposition 3. If $p$ is an odd prime, then $\sqrt{(-1)^{(p-1)/2}\,p} \in \mathbb{Q}(\zeta_p)$.
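Proposition 3 can be checked numerically. The sketch below assumes the classical route through the quadratic Gauss sum $g = \sum_{k=1}^{p-1} \left(\frac{k}{p}\right)\zeta_p^k$, which lies in $\mathbb{Q}(\zeta_p)$ and satisfies $g^2 = (-1)^{(p-1)/2}p$; the text does not say which proof it has in mind, so this is an illustration rather than the author's argument. The helper names are hypothetical.

```python
import cmath


def legendre(k: int, p: int) -> int:
    """Legendre symbol (k/p) for an odd prime p, via Euler's criterion."""
    r = pow(k, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def gauss_sum(p: int) -> complex:
    """g = sum over k = 1..p-1 of (k/p) * zeta_p^k, computed in floating point."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(k, p) * zeta ** k for k in range(1, p))


def signed_prime(p: int) -> int:
    """The integer (-1)^((p-1)/2) * p whose square root Proposition 3 concerns."""
    return p if p % 4 == 1 else -p
```

Comparing `gauss_sum(p) ** 2` with `signed_prime(p)` for small odd primes shows agreement up to rounding error.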

Observe that if $\sqrt[n]{a}$ is genuine and $a = p_1^{e_1} \cdots p_r^{e_r}$ is the factorisation of $a$ into
distinct prime powers, then g.c.d.$(e_1, e_2, \ldots, e_r, n) = 1$. For, if $t = (e_1, e_2, \ldots, e_r, n)$
and $b = p_1^{e_1/t} \cdots p_r^{e_r/t}$, then $\sqrt[n]{a} = \sqrt[n/t]{b}$. Thus, if $n$ has an odd prime factor $p$,
in order to show that $\sqrt[n]{a}$ is not contained in any cyclotomic extension it suffices to
show that $(\sqrt[n]{a})^{n/p} = \sqrt[p]{a}$ is not contained in a cyclotomic extension. On the other
hand, if $n = 2^r$, $r \ge 2$, it suffices to show that $(\sqrt[n]{a})^{n/4} = \sqrt[4]{a}$ is not contained in any
cyclotomic extension, i.e., it suffices (as in the case of FLT) to prove our theorem for
$n = 4$ or $p$ where $p$ is any odd prime.
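The reduction $\sqrt[n]{a} = \sqrt[n/t]{b}$ with $t = (e_1, \ldots, e_r, n)$ can be sketched as follows. The factorisation uses plain trial division, which is only an assumption suitable for small $a$; the function names are illustrative.

```python
from math import gcd


def prime_exponents(a: int) -> dict[int, int]:
    """Trial-division factorisation of an integer a > 1 as {prime: exponent}."""
    factors: dict[int, int] = {}
    d = 2
    while d * d <= a:
        while a % d == 0:
            factors[d] = factors.get(d, 0) + 1
            a //= d
        d += 1
    if a > 1:
        factors[a] = factors.get(a, 0) + 1
    return factors


def reduce_root(a: int, n: int) -> tuple[int, int]:
    """Return (b, m) such that the n-th root of a equals the m-th root of b, with m = n / t."""
    exponents = prime_exponents(a)
    t = n
    for e in exponents.values():
        t = gcd(t, e)                 # t = gcd(e_1, ..., e_r, n)
    b = 1
    for p, e in exponents.items():
        b *= p ** (e // t)            # b = product of p_i^(e_i / t)
    return b, n // t
```

For example, `reduce_root(4, 4)` returns `(2, 2)`, recovering $\sqrt[4]{4} = \sqrt{2}$. Whenever the returned index is smaller than $n$, the root is not genuine.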

The  proof  follows  from  the  following  propositions  which  can  be  found  in  any 
standard  text  on  Galois  theory  (see,  for  instance  [1]).  We  will  also  refer  to  the  article 
[2]  on  Galois  theory  by  B  Sury  which  appeared  in  Resonance.  In  what  follows,  K 
and F will always denote subfields of $\mathbb{C}$, and if K is any such field we denote by
G(K)  the  group  of  automorphisms  of  K.  [K  :  F]  denotes  the  dimension  of  K  as  a 
vector  space  over  F. 

Proposition 4. If $F \subset K \subset L$ then $[L : F] = [L : K][K : F]$.

It is easy to see that if the $\alpha_i$'s form a basis of $K$ over $F$ and the $\beta_j$'s form a basis of $L$ over
$K$, then the $\alpha_i\beta_j$'s form a basis of $L$ over $F$.

PROPOSITION 5. $[F(\alpha) : F]$ is equal to the degree of the unique monic polynomial
$f_\alpha$ of minimal degree in $F[X]$ satisfied by $\alpha$, and this is the same as the degree of
any irreducible polynomial in $F[X]$ satisfied by $\alpha$.

(See lemma in [2].) It can easily be seen by using the Euclidean algorithm for polynomials
that $f_\alpha$ divides any polynomial in $F[X]$ that has $\alpha$ as a root and hence divides
any irreducible polynomial $g$ satisfied by $\alpha$. Irreducibility of $g$ implies that $g = cf_\alpha$
for some constant $c$ in $F$.

PROPOSITION 6. The group $G(\mathbb{Q}(\zeta_m))$ of automorphisms of the field $\mathbb{Q}(\zeta_m)$ for any
$m > 2$ is abelian; in fact, it is isomorphic to the group of units in the ring $\mathbb{Z}/m\mathbb{Z}$.

It is clear that if $\sigma$ is an automorphism of $\mathbb{Q}(\zeta_m)$ then since $\zeta_m^m = 1$ we get $\sigma(\zeta_m)^m = 1$,
so $\sigma(\zeta_m)$ is another $m$th root of 1. Since an automorphism of a group preserves
order and $\sigma$ is an automorphism of the multiplicative group $(\mathbb{Q}(\zeta_m) - (0))$, $\sigma(\zeta_m)$
has order $m$ so $\sigma(\zeta_m) = \zeta_m^l$ for some $l$ coprime to $m$. We thus have a map $\sigma \mapsto l$
from $G(\mathbb{Q}(\zeta_m))$ to the group of units in $\mathbb{Z}/m\mathbb{Z}$. That the map is a homomorphism is a
simple exercise. It is clear that $\sigma$ is completely determined by its action on $\zeta_m$ since
$\zeta_m$ generates $\mathbb{Q}(\zeta_m)$. Hence the map is injective. That the map is surjective follows
from Proposition 8.

PROPOSITION 7. If $\mathbb{Q} \subset F \subset K$ where $F$ and $K$ are each generated over $\mathbb{Q}$ by the
roots of some polynomials in $\mathbb{Q}[X]$, i.e., $F$ and $K$ are splitting fields of polynomials
in $\mathbb{Q}[X]$ (see [2]), then $G(F)$ is isomorphic to $G(K)/G(K/F)$ where $G(K/F)$
denotes the subgroup of $G(K)$ consisting of those automorphisms of $K$ which fix the
elements of $F$. Hence if $G(K)$ is abelian, so is $G(F)$.

This follows easily if we consider the restriction map from $G(K)$ to $G(F)$. The fact
that if $\sigma \in G(K)$, then $\sigma(F) = F$ follows from the fact that $F$ is normal over $\mathbb{Q}$
(refer [2], Box 14). For, suppose $F = \mathbb{Q}(\alpha_1, \alpha_2, \ldots, \alpha_n)$ where the $\alpha_i$'s are roots
of some polynomial $f(X)$ in $\mathbb{Q}[X]$. Since $f(\alpha_i) = 0$ we have $\sigma(f(\alpha_i)) = 0$. But
$\sigma(f(\alpha_i)) = f(\sigma(\alpha_i))$, so $\sigma(\alpha_i)$ must be another root of $f$, i.e., $\sigma(\alpha_i) = \alpha_j$ for some
$j$; $\sigma$ permutes the roots of $f$ and so $\sigma(F) = F$.

PROPOSITION 8. If $K$ is generated over $F$ by the roots of some polynomial in $F[X]$
and $\alpha, \alpha'$ are two roots in $K$ of an irreducible polynomial in $F[X]$, then there exists
an automorphism $\sigma$ in $G(K/F)$ such that $\sigma(\alpha) = \alpha'$.

Writing $f$ for the irreducible polynomial in question, we have an isomorphism (just the
substitution map) from $F[X]/(f)$ to $F(\alpha)$ which maps $X + (f)$ to $\alpha$ and similarly an
isomorphism from $F[X]/(f)$ to $F(\alpha')$ which maps $X + (f)$ to $\alpha'$. Hence we have an
isomorphism from $F(\alpha)$ to $F(\alpha')$ which maps $\alpha$ to $\alpha'$. This
map extends to an automorphism of $K$ (see Proposition 5.2 in [1]).


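Proposition 9. If $p$ is an odd prime or 4 and if $\sqrt[p]{a}$ is genuine with $a > 1$ then
$G(\mathbb{Q}(\sqrt[p]{a}, \zeta_p))$ is not abelian.


If $p$ is an odd prime, $\mathbb{Q}(\zeta_p)$ is the field generated over $\mathbb{Q}$ by the roots of the polynomial
$1 + X + X^2 + \cdots + X^{p-1}$. If $p$ is an odd prime or 4 and $a > 1$, then $\mathbb{Q}(\sqrt[p]{a}, \zeta_p)$ is the
field generated over $\mathbb{Q}$ by the roots of $X^p - a$. Both these polynomials are irreducible
over $\mathbb{Q}$. Hence

$$[\mathbb{Q}(\zeta_p) : \mathbb{Q}] = \begin{cases} p - 1 & \text{if } p \text{ is odd} \\ 2 & \text{if } p = 4 \end{cases}$$

and $[\mathbb{Q}(\sqrt[p]{a}) : \mathbb{Q}] = p$. Hence by Proposition 4

$$[\mathbb{Q}(\sqrt[p]{a}, \zeta_p) : \mathbb{Q}] = \begin{cases} p(p - 1) & \text{if } p \text{ is odd} \\ 8 & \text{if } p = 4. \end{cases}$$

It follows again by Proposition 4 that $[\mathbb{Q}(\sqrt[p]{a}, \zeta_p) : \mathbb{Q}(\zeta_p)] = p$ and
$[\mathbb{Q}(\sqrt[p]{a}, \zeta_p) : \mathbb{Q}(\sqrt[p]{a})] = p - 1$ or 2 according as $p$ is odd or 4, respectively. Hence by Proposition 5,
$X^p - a$ is irreducible over $\mathbb{Q}(\zeta_p)$ and $1 + X + X^2 + \cdots + X^{p-1}$ is irreducible over
$\mathbb{Q}(\sqrt[p]{a})$ if $p$ is odd whilst $X^2 + 1$ is irreducible over $\mathbb{Q}(\sqrt[4]{a})$. Observe that $\sqrt[p]{a}$ and
$\sqrt[p]{a}\,\zeta_p$ are roots of $X^p - a$, and $\zeta_p$ and $\zeta_p^2$ are roots of $1 + X + \cdots + X^{p-1}$ if $p$ is odd
whereas $\zeta_p$ and $\zeta_p^3$ are roots of $X^2 + 1$ if $p = 4$. Hence by Proposition 8 there exists
an automorphism $\sigma \in G(\mathbb{Q}(\sqrt[p]{a}, \zeta_p)/\mathbb{Q}(\zeta_p))$ (i.e. $\sigma$ fixes $\zeta_p$) such that $\sigma(\sqrt[p]{a}) = \sqrt[p]{a}\,\zeta_p$
and there exists a $\tau \in G(\mathbb{Q}(\sqrt[p]{a}, \zeta_p)/\mathbb{Q}(\sqrt[p]{a}))$ (i.e. $\tau$ fixes $\sqrt[p]{a}$) such that $\tau(\zeta_p) = \zeta_p^2$ if
$p$ is odd and $\tau(\zeta_p) = \zeta_p^3$ if $p = 4$. Hence,

$$\sigma\tau(\sqrt[p]{a}) = \sigma(\sqrt[p]{a}) = \sqrt[p]{a}\,\zeta_p$$

whereas

$$\tau\sigma(\sqrt[p]{a}) = \tau(\sqrt[p]{a}\,\zeta_p) = \tau(\sqrt[p]{a})\,\tau(\zeta_p) = \begin{cases} \sqrt[p]{a}\,\zeta_p^2 & \text{if } p \text{ is odd} \\ \sqrt[p]{a}\,\zeta_p^3 & \text{if } p = 4. \end{cases}$$

In either case $\sigma\tau \neq \tau\sigma$ and $G(\mathbb{Q}(\sqrt[p]{a}, \zeta_p))$ is not abelian.

Observe that if $\sqrt[n]{a} \in \mathbb{Q}(\zeta_m)$, then $\mathbb{Q}(\sqrt[n]{a}, \zeta_n) \subset \mathbb{Q}(\zeta_m, \zeta_n) = \mathbb{Q}(\zeta_{[m,n]})$ where
$[m, n]$ is the l.c.m. of $m$ and $n$. If $\mathbb{Q}(\sqrt[n]{a}, \zeta_n)$ was contained in the cyclotomic extension
$\mathbb{Q}(\zeta_m)$, its group of automorphisms would, by Proposition 7, be a quotient of the
abelian group $G(\mathbb{Q}(\zeta_m))$, and hence abelian, contradicting Proposition 9 when $n$ is an
odd prime or 4.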
Remarks

We have apparently proved the stronger result that for a genuine $p$th root $\sqrt[p]{a}$ (with $p$
an odd prime and $a > 1$), the Galois extension field generated by it is not an abelian
extension of $\mathbb{Q}$. However, this is not really a stronger statement. The deep theorem of
Kronecker and Weber referred to in the introduction says that any abelian extension
of $\mathbb{Q}$ is contained in a cyclotomic extension. The interesting question is whether one
can similarly obtain the abelian extensions of any algebraic number field by adjoining
special values of transcendental functions. For imaginary quadratic fields $\mathbb{Q}(\sqrt{-d})$,
this has been solved using the so-called theory of complex multiplication. Roughly,
the role of the function $e^{2\pi i x}$ is taken by the elliptic modular $j$-function and the
values are considered at points of finite order on the elliptic curves (in place of the
circle as was in the case of the Kronecker-Weber theorem). The general question is
known as Kronecker’s ‘Jugendtraum’ (the German word means ‘dream of youth’) and
is still open. It is one of the famous ‘Hilbert problems’ (the 12th problem). Hilbert
writes in his 1900 address at the International Congress of Mathematicians that the
extension of Kronecker’s theorem to any algebraic number field seems to him to be
of the greatest importance and that he regards this as one of the most profound and
far-reaching problems in the theory of numbers.
Die ganzen zahlen hat Gott gemacht

Polynomials  with  Integer  Values 

B  Sury 

A quote attributed to the famous mathematician L Kronecker is ‘Die ganzen Zahlen
hat Gott gemacht, alles andere ist Menschenwerk.’ A translation might be ‘God gave
us integers and all else is man’s work.’ All of us are familiar already from middle
school with the similarities between the set of integers and the set of all polynomials
in one variable. A paradigm of this is the Euclidean (division) algorithm. However,
it requires an astute observer to notice that one has to deal with polynomials with
real or rational coefficients rather than just integer coefficients for a strict analogy.
There are also some apparent dissimilarities; for instance, there is no notion among
integers corresponding to the derivative of a polynomial. In this discussion, we shall
consider polynomials with integer coefficients. Of course a complete study of this
encompasses the whole subject of algebraic number theory, one might say. For
most of this paper (in fact, with the exception of Lemma 5, Lemma 7 and Exercise 3),
we adhere to fairly elementary methods and address a number of rather natural questions.
To give a prelude, one such question might be “if an integral polynomial takes
only values which are perfect squares, then must it be the square of a polynomial?”

Note that the polynomial $\binom{X}{n} = \frac{X(X-1)\cdots(X-n+1)}{n!}$ takes
integer values at all integers although, for $n \ge 2$, it does not have integer coefficients. By $\mathbb{Z}$, we
shall denote the set of integers.


Prime  Values  and  Irreducibility 

The  first  observation  about  polynomials  taking  integral  values  is 


LEMMA 1. A polynomial $P$ takes $\mathbb{Z}$ to $\mathbb{Z}$ if, and only if, $P(X) = a_0 + a_1\binom{X}{1} + \cdots + a_n\binom{X}{n}$
with $a_i \in \mathbb{Z}$.

PROOF. The sufficiency is evident. For the converse, we first note that any polynomial
whatsoever can be written in this form for some $n$ and some (possibly nonintegral)
$a_i$'s. Writing $P$ in this form and assuming that $P(\mathbb{Z}) \subset \mathbb{Z}$, we have

$P(0) = a_0 \in \mathbb{Z}$

$P(1) = a_0 + a_1 \in \mathbb{Z}$

$P(2) = a_0 + a_1\binom{2}{1} + a_2 \in \mathbb{Z}$

and so on. Inductively, since $P(m) \in \mathbb{Z}\ \forall m$, we get $a_i \in \mathbb{Z}\ \forall i$.

COROLLARY 1. If a polynomial $P$ takes $\mathbb{Z}$ to $\mathbb{Z}$ and has degree $n$, then $n!\,P(X) \in \mathbb{Z}[X]$.
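The inductive step in the proof of Lemma 1 amounts to taking repeated forward differences: $a_k$ is the $k$th difference of the values $P(0), P(1), \ldots$ at 0. A minimal sketch, with illustrative function names and the sample polynomial $X(X+1)/2$ chosen only as an example:

```python
from fractions import Fraction
from math import comb


def binomial_coefficients(values: list) -> list:
    """Given P(0), ..., P(n) with deg P <= n, return a_0..a_n with P(X) = sum a_k * C(X, k)."""
    coeffs = []
    row = list(values)
    while row:
        coeffs.append(row[0])                                      # a_k = k-th forward difference at 0
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]   # next difference row
    return coeffs


def evaluate(coeffs: list, m: int):
    """Evaluate sum a_k * C(m, k) at an integer m >= 0."""
    return sum(a * comb(m, k) for k, a in enumerate(coeffs))


def sample_polynomial(x: int) -> Fraction:
    """X(X+1)/2: integer valued, although its coefficients are not integers."""
    return Fraction(x * (x + 1), 2)


coeffs = binomial_coefficients([sample_polynomial(m) for m in range(3)])
```

Here `coeffs` comes out as 0, 1, 1, i.e. $X(X+1)/2 = \binom{X}{1} + \binom{X}{2}$, with all $a_i$ integral as the lemma predicts.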

LEMMA 2. A nonconstant integral polynomial $P(X)$ cannot take only prime values.

PROOF. If all values are composite, then there is nothing to prove. So assume that
$P(a) = p$ for some integer $a$ and prime $p$. Now, as $P$ is nonconstant,

$$\lim_{n \to \infty} |P(a + np)| = \infty.$$

So, for big enough $n$, $|P(a + np)| > p$. But $P(a + np) \equiv P(a) \equiv 0 \bmod p$, which
shows $P(a + np)$ is composite.
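The argument of Lemma 2 can be watched in action. The polynomial $X^2 + X + 41$ below is only an illustrative choice (it is not taken from the text); it gives $P(0) = 41$, so every value $P(41n)$ must be a multiple of 41.

```python
def is_prime(n: int) -> bool:
    """Primality test by trial division, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def P(x: int) -> int:
    """Sample integral polynomial X^2 + X + 41 (an assumption made for illustration)."""
    return x * x + x + 41


a = 0
p = P(a)                                           # the prime value P(a) = 41
values = [P(a + n * p) for n in range(1, 6)]       # P(a + np) for n = 1..5
```

Each entry of `values` is divisible by 41 and larger than 41, hence composite; the first one is $P(41) = 1763 = 41 \cdot 43$.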

REMARK 1. Infinitely many primes can occur as integral values of a polynomial.
For example, if $(a, b) = 1$, then the well-known (but deep) Dirichlet’s theorem on
primes in progression shows that the polynomial $aX + b$ takes infinitely many prime
values. In general, it may be very difficult to decide whether a given polynomial
takes infinitely many prime values. For instance, it is not known if $X^2 + 1$ represents
infinitely many primes. In fact, there is no polynomial of degree $\ge 2$ which is known
to take infinitely many prime values.

LEMMA 3. If $P$ is a nonconstant, integral-valued polynomial, then the number of
prime divisors of its values $\{P(m)\}_{m \in \mathbb{Z}}$ is infinite, i.e., not all terms of the sequence
$P(0), P(1), \ldots$ can be built from finitely many primes.

PROOF. It is clear from Corollary 1 above that it is enough to prove this for $P(X) \in \mathbb{Z}[X]$,
which we will henceforth assume. Now, $P(X) = \sum_{i=0}^{n} a_i X^i$, where $n \ge 1$. If
$a_0 = 0$, then clearly $P(p) \equiv 0 \bmod p$ for any prime $p$. If $a_0 \neq 0$, let us consider for
any non-zero integer $t$ the polynomial

$$P(a_0 tX) = \sum_{i=0}^{n} a_i (a_0 tX)^i = a_0\left\{1 + \sum_{i=1}^{n} a_i a_0^{i-1} t^i X^i\right\} = a_0 Q(X).$$

There exists some prime number $p$ such that $Q(m) \equiv 0 \bmod p$ for some $m$,
because $Q$ can take the values $0, 1, -1$ only at finitely many points. Since
$Q(m) \equiv 1 \bmod t$, we have $(p, t) = 1$. Then $P(a_0 tm) \equiv 0 \bmod p$. Since $t$ was arbitrary,
the set of $p$ arising in this manner is infinite.
Remark 2.

(a) Note that it may be possible to construct infinitely many terms of the sequence
$\{P(m)\}_{m \in \mathbb{Z}}$ using only a finite number of primes. For example, take $(a, d) = 1$,
$a > d > 1$. Since, by Euler’s theorem, $a^{\varphi(d)} \equiv 1 \bmod d$, the numbers
$\frac{a(a^{\varphi(d)n} - 1)}{d} \in \mathbb{Z}\ \forall n$. For the polynomial $P(X) = dX + a$, the infinitely many
values $P\!\left(\frac{a}{d}(a^{\varphi(d)n} - 1)\right) = a^{\varphi(d)n + 1}$ have only prime factors coming from
primes dividing $a$.

(b) In order that the values of an integral polynomial $P(X)$ be prime for infinitely
many integers, $P(X)$ must be irreducible over $\mathbb{Z}$ and of content 1. By content,
we mean the greatest common divisor of the coefficients.
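The construction in Remark 2(a) is easy to verify for particular values. The sketch below computes Euler's function by direct counting, which is only sensible for small $d$; the function names and the sample pair $a = 5$, $d = 3$ are illustrative.

```python
from math import gcd


def euler_phi(d: int) -> int:
    """Euler's totient of d, by counting residues coprime to d."""
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)


def remark_values(a: int, d: int, count: int) -> list[tuple[int, int]]:
    """Pairs (x_n, P(x_n)) with x_n = a(a^(phi(d) n) - 1)/d and P(X) = dX + a, for n = 1..count."""
    phi = euler_phi(d)
    pairs = []
    for n in range(1, count + 1):
        numerator = a * (a ** (phi * n) - 1)
        if numerator % d != 0:          # cannot happen when gcd(a, d) == 1, by Euler's theorem
            raise ValueError("a and d must be coprime")
        x = numerator // d
        pairs.append((x, d * x + a))
    return pairs


pairs = remark_values(5, 3, 4)
```

With $a = 5$ and $d = 3$ the first pair is $(40, 125)$, and every value $P(x_n)$ is a power of 5.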


Box  1.  Eisenstein’s  Criterion  and  More 

Perhaps  the  only  general  criterion  known  to  check  whether  an  integral  poly¬ 
nomial  of  a  special  kind  is  irreducible  is  due  to  G  Eisenstein,  a  student  of 
Gauss  and  an  outstanding  mathematician,  whom  Gauss  is  said  to  have  rated 
above  himself.  Eisenstein  died  when  he  was  27. 

Let  f  {X)  =  aQ  +  cnX  +  ‘  +  anXn  be  an  integral  polynomial  satisfying  the 

following  property  with  respect  to  some  prime  p.  The  prime  p  divides  ao,  a  \, 

. . . ,  an-  i  but  does  not  divide  an.  Also ,  assume  that  p 2  does  not  divide  ao. 
Then,  f  is  irreducible. 

The proof is indeed very simple high school algebra. Suppose, if possible, that $f(X) = g(X)h(X) = (b_0 + b_1 X + \cdots + b_r X^r)(c_0 + c_1 X + \cdots + c_s X^s)$ with $r, s \ge 1$. Comparing coefficients, one has

$$a_0 = b_0 c_0,\quad a_1 = b_0 c_1 + b_1 c_0,\ \ldots,\ a_n = b_r c_s,\quad r + s = n.$$

Since $a_0 = b_0 c_0 \equiv 0 \bmod p$, either $b_0 \equiv 0 \bmod p$ or $c_0 \equiv 0 \bmod p$.

To fix notation, we may assume that $b_0 \equiv 0 \bmod p$. Since $a_0 \not\equiv 0 \bmod p^2$, we must have $c_0 \not\equiv 0 \bmod p$. Now $a_1 = b_0 c_1 + b_1 c_0 \equiv b_1 c_0 \bmod p$; so $b_1 \equiv 0 \bmod p$. Proceeding inductively in this manner, it is clear that all the $b_i$'s are multiples of $p$. This is a manifest contradiction of the fact that $a_n = b_r c_s$ is not a multiple of $p$. This finishes the proof.

It may be noted that one may reverse the roles of $a_0$ and $a_n$ and obtain another version of the criterion:

Let $f(X) = a_0 + a_1 X + \cdots + a_n X^n$ be an integral polynomial satisfying the following property with respect to some prime $p$. The prime $p$ divides $a_1, a_2, \ldots, a_n$ but does not divide $a_0$. Also, assume that $p^2$ does not divide $a_n$. Then, $f$ is irreducible.

The following generalisation is similar to prove and is left as an exercise. Let $f(X) = a_0 + a_1 X + \cdots + a_n X^n$ be an integral polynomial satisfying the following property with respect to some prime $p$. Let $t$ be such that the prime $p$ divides $a_0, a_1, \ldots, a_{n-t}$ but does not divide $a_n$. Also, assume that $p^2$ does not divide $a_0$. Then, $f$ is either irreducible or has a nonconstant factor of degree less than $t$.
In general, it is difficult to decide whether a given integral polynomial is irreducible or not. We note that the irreducibility of $P(X)$ and the condition that it have content 1 are not sufficient to ensure that $P(X)$ takes infinitely many prime values. For instance, the polynomial $X^n + 105X + 12$ is irreducible, by Eisenstein’s criterion (see Box 1). But, it cannot take any prime value because it takes only even values, and it does not take either of the values $\pm 2$ since both $X^n + 105X + 10$ and $X^n + 105X + 14$ are irreducible, again by Eisenstein’s criterion.
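The use of Eisenstein's criterion in this example can be made concrete. The sketch below is only an illustration: the helper names and the sample degree $n = 5$ are assumptions, and the function merely searches for primes satisfying the hypotheses of the criterion stated in Box 1.

```python
def eisenstein_primes(coeffs: list[int]) -> list[int]:
    """Primes p for which Eisenstein's criterion applies.

    coeffs[i] is the coefficient of X**i, so coeffs[-1] is the leading one.
    """
    a0, lead = coeffs[0], coeffs[-1]
    witnesses = []
    p, m = 2, abs(a0)
    # Only prime divisors of a0 can work, so trial-divide |a0|.
    while m > 1:
        if m % p == 0:
            while m % p == 0:
                m //= p
            if (all(c % p == 0 for c in coeffs[:-1])
                    and lead % p != 0 and a0 % (p * p) != 0):
                witnesses.append(p)
        p += 1
    return witnesses

def example_poly(n: int, constant: int) -> list[int]:
    """Coefficient list of X**n + 105*X + constant, for n >= 2."""
    return [constant, 105] + [0] * (n - 2) + [1]

def evaluate(coeffs: list[int], x: int) -> int:
    return sum(c * x ** i for i, c in enumerate(coeffs))

n = 5  # hypothetical sample degree
witness_12 = eisenstein_primes(example_poly(n, 12))
witness_10 = eisenstein_primes(example_poly(n, 10))
witness_14 = eisenstein_primes(example_poly(n, 14))
all_even = all(evaluate(example_poly(n, 12), x) % 2 == 0 for x in range(-50, 51))
```

The three polynomials are handled by the primes 3, 5 and 7 respectively, which all divide 105.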

LEMMA 4. Let $a_1, \ldots, a_n$ be distinct integers. Then $P(X) = (X - a_1) \cdots (X - a_n) - 1$ is irreducible.

PROOF. Suppose, if possible, $P(X) = f(X)g(X)$ with $\deg f, \deg g < n$. Evidently, as $f(a_i)g(a_i) = -1$, $f(a_i) = -g(a_i) = \pm 1$ $\forall\, 1 \le i \le n$. Now, $f(X) + g(X)$ being a polynomial of degree $< n$ which vanishes at the $n$ distinct integers $a_1, \ldots, a_n$ must be identically zero. This gives $P(X) = -f(X)^2$, but this is impossible as can be seen by comparing the coefficients of $X^n$.

EXERCISE 1. Let $n$ be odd and $a_1, \ldots, a_n$ be distinct integers. Prove that $(X - a_1) \cdots (X - a_n) + 1$ is irreducible.

Let us consider the following situation. Suppose $p = a_n \ldots a_0$ is a prime number expressed in the usual decimal system, i.e., $p = a_0 + 10a_1 + 100a_2 + \cdots + 10^n a_n$, $0 \le a_i \le 9$. Then, is the polynomial $a_0 + a_1 X + \cdots + a_n X^n$ irreducible? This is, in fact, true and, more generally,

LEMMA 5. Let $P(X) \in \mathbb{Z}[X]$ and assume that there exists an integer $n$ such that

(i) the zeros of $P$ lie in the half plane $\operatorname{Re}(z) < n - \tfrac12$,

(ii) $P(n - 1) \ne 0$,

(iii) $P(n)$ is a prime number.

Then $P(X)$ is irreducible.

PROOF. Suppose, if possible, $P(X) = f(X)g(X)$ over $\mathbb{Z}$. All the zeros of $f(X)$ also lie in $\operatorname{Re}(z) < n - \tfrac12$. Therefore, $|f(n - \tfrac12 - t)| < |f(n - \tfrac12 + t)|$ $\forall\, t > 0$. Since $f(n-1) \ne 0$ and $f(n-1)$ is integral, we have $|f(n-1)| \ge 1$. Thus $|f(n)| > |f(n-1)| \ge 1$. A similar thing holding for $g(X)$, we get that $P(n)$ has proper divisors $f(n)$, $g(n)$, which contradicts our hypothesis.


Irreducibility  and  Congruence  Modulo  p 

For an integral polynomial to take the value zero at an integer or even to be reducible, it is clearly necessary that these properties hold modulo any integer $m$. Conversely, if an irreducible $P(X)$ has a root modulo every integer, it must itself have a root in $\mathbb{Z}$. In fact, if an irreducible $P(X) \in \mathbb{Z}[X]$ has a linear factor modulo all but finitely many prime numbers, then $P(X)$ itself has a linear factor. This fact can be proved only by deep methods, viz. using the so-called Čebotarev density theorem. On the other hand (see Lemma 7), it was first observed by Hilbert that the reducibility of a polynomial modulo every integer is not sufficient to guarantee its reducibility over $\mathbb{Z}$. Regarding roots of a polynomial modulo a prime, there is the following general result due to Lagrange:

LEMMA 6. Let $p$ be a prime number and let $P(X) \in \mathbb{Z}[X]$ be of degree $n$. Assume that not all coefficients of $P$ are multiples of $p$. Then the number of solutions mod $p$ to $P(X) \equiv 0 \bmod p$ is, at the most, $n$.

The proof is obvious using the division algorithm over $\mathbb{Z}/p$. In fact, the general result of this kind (provable by the division algorithm again) is that a nonzero polynomial over any field has at the most its degree number of roots.

Remark 3. Since $1, 2, \ldots, p-1$ are solutions to $X^{p-1} \equiv 1 \bmod p$, we have $X^{p-1} - 1 \equiv (X-1)(X-2) \cdots (X-(p-1)) \bmod p$. For odd $p$, putting $X = 0$ gives Wilson’s theorem that $(p-1)! \equiv -1 \bmod p$.
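Both Lagrange's bound and the congruence of Remark 3 are easy to test for small primes. The following sketch uses hypothetical helper names and brute force only; it is meant as a sanity check, not as an efficient method.

```python
from math import factorial

def expand_product_mod_p(p: int) -> list[int]:
    """Coefficients (lowest degree first) of (X-1)(X-2)...(X-(p-1)) mod p."""
    coeffs = [1]                          # the constant polynomial 1
    for root in range(1, p):
        shifted = [0] + coeffs            # X * current polynomial
        scaled = [(-root * c) % p for c in coeffs] + [0]
        coeffs = [(s + t) % p for s, t in zip(shifted, scaled)]
    return coeffs

def count_roots_mod_p(coeffs: list[int], p: int) -> int:
    """Number of residues x mod p with P(x) = 0 mod p."""
    return sum(
        1 for x in range(p)
        if sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p == 0
    )

def wilson_holds(p: int) -> bool:
    return factorial(p - 1) % p == p - 1

# X^(p-1) - 1 mod p has coefficient list [p-1, 0, ..., 0, 1].
product_mod_7 = expand_product_mod_p(7)
roots_of_x2_plus_1_mod_13 = count_roots_mod_p([1, 0, 1], 13)
```

Here `product_mod_7` is compared with the coefficients of $X^6 - 1$ reduced modulo 7, and the root count for $X^2 + 1$ modulo 13 stays within the degree bound of Lemma 6.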

Note  that  we  have  observed  earlier  that  any  non-constant  integral  polynomial  has 
a  root  modulo  infinitely  many  primes.  However,  as  first  observed  by  Hilbert,  the 
reducibility  of  a  polynomial  modulo  every  integer  does  not  imply  its  reducibility 
over  Z.  For  example,  we  have  the  following  result: 

LEMMA 7. Let $p$, $q$ be odd prime numbers such that $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right) = 1$ and $p \equiv 1 \bmod 8$. Here $\left(\frac{p}{q}\right)$ denotes the Legendre symbol defined to be 1 or $-1$ according as $p$ is a square or not modulo $q$. Then, the polynomial $P(X) = (X^2 - p - q)^2 - 4pq$ is irreducible, whereas it is reducible modulo any integer.

Proof.

$$P(X) = X^4 - 2(p+q)X^2 + (p-q)^2 = (X - \sqrt p - \sqrt q)(X + \sqrt p + \sqrt q)(X - \sqrt p + \sqrt q)(X + \sqrt p - \sqrt q).$$

Since $\sqrt p$, $\sqrt q$, $\sqrt p \pm \sqrt q$, $\sqrt{pq}$ are all irrational, none of the linear or quadratic factors of $P(X)$ are in $\mathbb{Z}[X]$, i.e., $P(X)$ is irreducible. Note that it is enough to show that a factorisation of $P$ exists modulo any prime power, as we can use the Chinese remainder theorem to get a factorisation modulo a general integer.

Now, $P(X)$ can be written in the following ways:

$$\begin{aligned}
P(X) &= X^4 - 2(p+q)X^2 + (p-q)^2 \\
     &= (X^2 + p - q)^2 - 4pX^2 \\
     &= (X^2 - p + q)^2 - 4qX^2 \\
     &= (X^2 - p - q)^2 - 4pq.
\end{aligned}$$

The second and third equalities above show that $P(X)$ is reducible modulo any $p^n$ and any $q^n$ (recall that $p$ is a square modulo $q^n$ and $q$ is a square modulo $p^n$). Also, since $p \equiv 1 \bmod 8$, $p$ is a square modulo any $2^n$ and the second equality above again shows that $P(X)$ is the difference of two squares modulo $2^n$, and hence reducible mod $2^n$.

If $\ell$ is a prime $\ne 2, p, q$, let us show now that $P(X)$ is reducible modulo $\ell^n$ for any $n$.

At least one of $\left(\frac{p}{\ell}\right)$, $\left(\frac{q}{\ell}\right)$ and $\left(\frac{pq}{\ell}\right)$ is 1 because, by the product formula for Legendre symbols, $\left(\frac{p}{\ell}\right) \cdot \left(\frac{q}{\ell}\right) \cdot \left(\frac{pq}{\ell}\right) = 1$. According as $\left(\frac{p}{\ell}\right)$, $\left(\frac{q}{\ell}\right)$ or $\left(\frac{pq}{\ell}\right) = 1$, the second, third or fourth equality shows that $P(X)$ is reducible mod $\ell^n$ for any $n$.
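The reducibility half of Lemma 7 can be observed by brute force for a specific pair of primes. The sketch below assumes the sample choice $p = 17$, $q = 13$ (which satisfies the hypotheses) and searches for a factorisation into two monic quadratics modulo a handful of small moduli; the function names and the list of moduli are illustrative, and the code does not test irreducibility over $\mathbb{Z}$.

```python
def is_square_mod(a: int, m: int) -> bool:
    """True if a is congruent to a square modulo m (brute force)."""
    return any((x * x - a) % m == 0 for x in range(m))

def splits_mod(p: int, q: int, m: int) -> bool:
    """Does X^4 - 2(p+q)X^2 + (p-q)^2 factor as two monic quadratics mod m?

    (X^2 + aX + b)(X^2 - aX + d)
        = X^4 + (b + d - a^2) X^2 + a(d - b) X + b d,
    so it suffices to search a, b, d in Z/m.
    """
    middle = (-2 * (p + q)) % m
    constant = ((p - q) ** 2) % m
    for a in range(m):
        for b in range(m):
            for d in range(m):
                if ((b + d - a * a) % m == middle
                        and (a * (d - b)) % m == 0
                        and (b * d) % m == constant):
                    return True
    return False

P, Q = 17, 13  # hypothetical sample pair of primes
MODULI = [2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 17, 19, 23, 25, 27]
hypotheses_hold = is_square_mod(P, Q) and is_square_mod(Q, P) and P % 8 == 1
reducible_everywhere = all(splits_mod(P, Q, m) for m in MODULI)
```

Each of the three difference-of-squares identities in the proof produces a factorisation of exactly the shape searched for here, which is why restricting to the pair $(a, -a)$ of linear coefficients loses nothing.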

We  end  this  section  with  a  result  of  Schur  whose  proof  is  surprising  and  elegant  as 
well.  This  is: 

SCHUR’S THEOREM. For any $n$, the truncated exponential polynomial $E_n(X) = n!\,\bigl(1 + X + \frac{X^2}{2!} + \cdots + \frac{X^n}{n!}\bigr)$ is irreducible over $\mathbb{Z}$.

Just for this proof, we need some nontrivial number theoretic facts. A reader unfamiliar with these notions but who is prepared to accept at face value a couple of results can still appreciate the beauty of Schur’s proof. Here is where we have to take recourse to some very basic facts about prime decomposition in algebraic number fields. Suppose, if possible, that $E_n(X) = f(X)g(X)$ for some nonconstant, irreducible integral polynomial $f$. Let us write $f(X) = a_0 + a_1 X + \cdots + X^r$ (evidently, we may take the top coefficient of $f$ to be 1). Start with any (complex) root $\alpha$ of $f$ and look at the field $K = \mathbb{Q}(\alpha)$ of all those complex numbers which can be written as polynomials in $\alpha$ with coefficients from $\mathbb{Q}$. The basic fact that we will be using (without proof) is that any nonzero ideal in ‘the ring of integers of $K$’ (i.e., the subring $O_K$ of $K$ made up of those elements which satisfy a monic integral polynomial) is uniquely a product of nonzero prime ideals and a prime ideal can occur at the most $\deg f$ times. This is a good replacement for $K$ of the usual unique factorisation of natural numbers into prime numbers. The proof also uses a fact about prime numbers which was observed by Sylvester but is not trivial to prove.

SYLVESTER’S THEOREM. If $m \ge r$, then $(m+1)(m+2) \cdots (m+r)$ has a prime factor $p > r$.

The special case $m = r$ is known as Bertrand’s postulate.
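Sylvester's theorem is not proved here, but it can at least be checked numerically on a small range. The following is a minimal sketch with hypothetical helper names; it uses the fact that the largest prime factor of a product is the largest among the prime factors of its terms.

```python
def largest_prime_factor(n: int) -> int:
    """Largest prime dividing n (returns 1 for n = 1)."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return n if n > 1 else largest

def sylvester_holds(m: int, r: int) -> bool:
    """Does (m+1)(m+2)...(m+r) have a prime factor exceeding r?"""
    return max(largest_prime_factor(k) for k in range(m + 1, m + r + 1)) > r

checked = all(
    sylvester_holds(m, r) for r in range(1, 30) for m in range(r, 200)
)
# The hypothesis m >= r matters: 2 * 3 * 4 has no prime factor above 3.
fails_without_hypothesis = not sylvester_holds(1, 3)
```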

PROOF OF SCHUR’S THEOREM. Now, the proof uses the following fact which is interesting in its own right:

Any prime dividing the constant term $a_0$ of $f$ is at most the degree $r$ of $f$.

To see this, let $p$ be a prime dividing $a_0$ and note first that $N(\alpha)$, the ‘norm of $\alpha$’ (a name for the product of all the roots of the minimal polynomial $f$ of $\alpha$), is $a_0$ up to sign. So, there is a prime ideal $P$ of $O_K$ such that $(\alpha) = P^k I$, $(p) = P^l J$, where $I$, $J$ are indivisible by $P$ and $k, l \ge 1$. Here, $(\alpha)$ and $(p)$ denote, respectively, the ideals of $O_K$ generated by $\alpha$ and $p$. Since $E_n(\alpha) = 0$, we have

$$0 = n! + n!\,\alpha + n!\,\frac{\alpha^2}{2!} + \cdots + \alpha^n.$$

We know that the exact power of $p$ dividing $n!$ is

$$h_n = [n/p] + [n/p^2] + \cdots.$$
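The formula for the exact power of $p$ dividing $n!$, and the bound $h_n < n/(p-1)$ that the argument relies on, can be checked directly. This is a small sketch with hypothetical function names.

```python
from math import factorial

def legendre_exponent(n: int, p: int) -> int:
    """h_n = [n/p] + [n/p^2] + ..., the exponent of the prime p in n!."""
    exponent, power = 0, p
    while power <= n:
        exponent += n // power
        power *= p
    return exponent

def exponent_by_division(m: int, p: int) -> int:
    """Exponent of p in m, found by repeated division."""
    count = 0
    while m % p == 0:
        m //= p
        count += 1
    return count

formula_agrees = all(
    legendre_exponent(n, p) == exponent_by_division(factorial(n), p)
    for n in range(1, 61) for p in (2, 3, 5, 7)
)
# Integer form of the strict bound h_n < n / (p - 1).
bound_holds = all(
    legendre_exponent(n, p) * (p - 1) < n
    for n in range(1, 500) for p in (2, 3, 5, 7, 11)
)
```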
Thus, in $O_K$, the ideal $(n!)$ is divisible by $P^{l h_n}$ and no higher power of $P$. Similarly, for $1 \le i \le n$, the ideal generated by $n!\,\alpha^i/i!$ is divisible by $P^{l h_n - l h_i + k i}$. Because of the equality

$$-n! = n!\,\alpha + n!\,\frac{\alpha^2}{2!} + \cdots + \alpha^n,$$

it follows that we cannot have each $l h_n - l h_i + k i$ strictly bigger than $l h_n$, which is the exact power of $P$ dividing the left-hand side. Therefore, there is some $i$ such that $-l h_i + k i \le 0$. Thus,

$$i \le k i \le l h_i = l\bigl([i/p] + [i/p^2] + \cdots\bigr) < \frac{l\,i}{p - 1}.$$

Thus, $p - 1 < l \le r$, i.e., $p \le r$. This confirms the observation.

To continue with the proof, we may clearly assume that the degree $r$ of $f$ is at most $n/2$. Now, we use Sylvester’s theorem to choose a prime $q > r$ dividing the product $n(n-1) \cdots (n-r+1)$. Note that we can use this theorem because the smallest term $n - r + 1$ of this $r$-fold consecutive product is bigger than $r$ as $r \le n/2$. Note also that the observation tells us that $q$ cannot divide $a_0$. Now, we shall write $E_n(X)$ modulo the prime $q$. By choice, $q$ divides the coefficients of $X^i$ for $0 \le i \le n - r$.

So, $f(X)g(X) \equiv X^n + n!\,\frac{X^{n-1}}{(n-1)!} + \cdots + n!\,\frac{X^{n-r+1}}{(n-r+1)!} \bmod q$.

Write $f(X) = a_0 + a_1 X + \cdots + X^r$ and $g(X) = b_0 + b_1 X + \cdots + X^{n-r}$.

The above congruence gives $a_0 b_0 \equiv 0$, $a_0 b_1 + a_1 b_0 \equiv 0$ etc. mod $q$ until the coefficient of $X^{n-r}$ of $f(X)g(X)$. As $a_0 \not\equiv 0 \bmod q$, we get recursively (this is just like the proof of Eisenstein’s criterion, see Box 1) that

$$b_0 \equiv b_1 \equiv \cdots \equiv b_{n-r} \equiv 0 \bmod q.$$

This is impossible as $b_{n-r} = 1$. Thus, Schur’s assertion follows.


Polynomials  taking  Square  Values 

If  an  integral  polynomial  takes  only  values  which  are  squares,  is  it  true  that  the  poly¬ 
nomial  itself  is  a  square  of  a  polynomial?  In  this  section,  we  will  show  that  this,  and 
more,  is  indeed  true. 

LEMMA 8. Let $P(X)$ be a $\mathbb{Z}$-valued polynomial which is irreducible. If $P$ is not a constant, then there exist arbitrarily large integers $n$ such that $P(n) \equiv 0 \bmod p$ and $P(n) \not\equiv 0 \bmod p^2$ for some prime $p$.

PROOF. First, suppose that $P(X) \in \mathbb{Z}[X]$. Since $P$ is irreducible, $P$ and $P'$ have no common factors. Write $f(X)P(X) + g(X)P'(X) = c$ for some $f, g \in \mathbb{Z}[X]$ and some nonzero integer $c$. By Lemma 3 there is a prime $p$ not dividing $c$ such that $P(n) \equiv 0 \bmod p$, where $n$ can be as large as we want. So, $P'(n) \not\equiv 0 \bmod p$ as $f(n)P(n) + g(n)P'(n) = c$. Since $P(n + p) - P(n) \equiv pP'(n) \bmod p^2$, either $P(n + p)$ or $P(n)$ is $\not\equiv 0 \bmod p^2$. To prove the result for general $P$, one can replace $P$ by $m!\,P$ where $m = \deg P$.
LEMMA 9. Let $P(X)$ be a $\mathbb{Z}$-valued polynomial such that the zeros of smallest multiplicity have multiplicity $m$. Then, there exist arbitrarily large integers $n$ such that $P(n) \equiv 0 \bmod p^m$, $P(n) \not\equiv 0 \bmod p^{m+1}$ for some prime $p$.

PROOF. Let $P_1(X), \ldots, P_r(X)$ be the distinct irreducible factors of $P(X)$. Write $P(X) = P_1(X)^{m_1} \cdots P_r(X)^{m_r}$ with $m = m_1 \le \cdots \le m_r$. By the above Lemma, one can find arbitrarily large $n$ such that for some prime $p$, $P_1(n) \equiv 0 \bmod p$, $P_1(n) \not\equiv 0 \bmod p^2$ and $P_i(n) \not\equiv 0 \bmod p$ for $i > 1$. Then, $P(n) \equiv 0 \bmod p^m$ and $\not\equiv 0 \bmod p^{m+1}$.

COROLLARY 2. If $P(X)$ takes at every integer a value which is the $k$th power of an integer, then $P(X)$ itself is the $k$th power of a polynomial.

PROOF. If $P(X)$ is not an exact $k$th power, then one can write $P(X) = f(X)^k g(X)$ for polynomials $f, g$ so that $g(X)$ has a zero whose multiplicity is $< k$. Once again, we can choose $n$ and a prime $p$ such that $g(n) \equiv 0 \bmod p$, $\not\equiv 0 \bmod p^k$. This contradicts the fact that $P(n)$ is a $k$th power.

[2]  is  an  excellent  source  of  results  of  this  nature. 


Cyclotomic  Polynomials 

These were referred to already in an earlier article ([1]). It was also shown there that one could use these polynomials to prove the existence of infinitely many primes congruent to 1 modulo $n$ for any $n$. For a natural number $d$, recall that the cyclotomic polynomial $\Phi_d(X)$ is the irreducible, monic polynomial whose roots are the primitive $d$th roots of unity, i.e., $\Phi_d(X) = \prod_{a \le d,\ (a,d)=1} (X - e^{2\pi i a/d})$. Note that $\Phi_1(X) = X - 1$ and that for a prime $p$, $\Phi_p(X) = X^{p-1} + \cdots + X + 1$. Observe that for any $n \ge 1$, $X^n - 1 = \prod_{d \mid n} \Phi_d(X)$.

EXERCISE 2. Prove that for any $d$, $\Phi_d(X)$ has integral coefficients, and is irreducible over $\mathbb{Z}$.

Factorising  an  integral  polynomial  into  irreducible  factors  is  far  from  easy.  Even 
if  we  know  the  irreducible  factors,  it  might  be  difficult  to  decide  whether  a  given 
polynomial  divides  another  given  one. 

Exercise 3.

(a) Given positive integers $a_1 < \cdots < a_n$, consider the polynomials $P(X) = \prod_{i>j}(X^{a_i - a_j} - 1)$ and $Q(X) = \prod_{i>j}(X^{i-j} - 1)$. By factorising into cyclotomic polynomials, prove that $Q(X)$ divides $P(X)$. Conclude that $\prod_{i>j} \frac{a_i - a_j}{i - j}$ is always an integer.

(b) Consider the $n \times n$ matrix $A$ whose $(i, j)$th entry is the Gaussian polynomial $\begin{bmatrix} a_i \\ j-1 \end{bmatrix}$. Compute $\det A$ to obtain part (a) again.

Here, for $m \ge r$, the Gaussian polynomial is defined as

$$\begin{bmatrix} m \\ r \end{bmatrix} = \frac{(X^m - 1)(X^{m-1} - 1) \cdots (X^{m-r+1} - 1)}{(X^r - 1)(X^{r-1} - 1) \cdots (X - 1)}.$$

Note that

$$\begin{bmatrix} m \\ r \end{bmatrix} = \begin{bmatrix} m-1 \\ r-1 \end{bmatrix} + X^r \begin{bmatrix} m-1 \\ r \end{bmatrix}.$$


It seems from looking at $\Phi_p(X)$ for prime $p$ as though the coefficients of the cyclotomic polynomials $\Phi_d(X)$ for any $d$ are all 0, 1 or $-1$. However, the following rather amazing fact was discovered by Schur. His proof uses a consequence of a deep result about prime numbers known as the prime number theorem. The prime number theorem tells us that $\pi(x) \sim x/\log(x)$ as $x \to \infty$. Here $\pi(x)$ denotes the number of primes up to $x$. The reader does not need to be familiar with the prime number theorem but is urged to take on faith the consequence of it that for any constant $c$, there is $n$ such that $\pi(2^n) \ge cn$.


Proposition  1.  Every  integer  occurs  as  a  coefficient  of  some  cyclotomic  polyno¬ 
mial. 

PROOF. First, we claim that for any integer $t \ge 2$, there are primes $p_1 < p_2 < \cdots < p_t$ such that $p_1 + p_2 > p_t$. Suppose this is not true. Then, for some $t > 2$, every set of $t$ primes $p_1 < \cdots < p_t$ satisfies $p_1 + p_2 \le p_t$. So, $2p_1 < p_t$. Therefore, the number of primes between $2^k$ and $2^{k+1}$ for any $k$ is less than $t$. So, $\pi(2^k) < kt$. This contradicts the prime number theorem as noted above. Hence, it is indeed true that for any integer $t \ge 2$, there are primes $p_1 < p_2 < \cdots < p_t$ such that $p_1 + p_2 > p_t$.

Now, let us fix any odd $t > 2$. We shall demonstrate that both $-t + 1$ and $-t + 2$ occur as coefficients. This will prove that all negative integers occur as coefficients. Then, using the fact that for an odd $m > 1$, $\Phi_{2m}(X) = \Phi_m(-X)$, we can conclude that all integers are coefficients.

Consider now primes $p_1 < p_2 < \cdots < p_t$ such that $p_1 + p_2 > p_t$. Write $p_t = p$ for simplicity. Let $n = p_1 \cdots p_t$ and let us write $\Phi_n(X)$ modulo $X^{p+1}$. Since $X^n - 1 = \prod_{d \mid n} \Phi_d(X)$, and since $p_1 + p_2 > p_t$, we have

$$\begin{aligned}
\Phi_n(X) &\equiv \frac{\prod_{i=1}^{t} (1 - X^{p_i})}{1 - X} \equiv (1 + \cdots + X^p)(1 - X^{p_1}) \cdots (1 - X^{p_t}) \\
          &\equiv (1 + \cdots + X^p)(1 - X^{p_1} - \cdots - X^{p_t}) \bmod X^{p+1}.
\end{aligned}$$

Therefore, the coefficients of $X^p$ and $X^{p-2}$ are $1 - t$ and $2 - t$, respectively. This completes the proof. Note that in the proof we have used the fact that if $P(X) = (1 - X^r)Q(X)$ for a polynomial $Q(X)$, then $Q(X) \equiv P(X)(1 + X^r + X^{2r} + \cdots)$ modulo any $X^k$.
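The smallest instance of this proof is $t = 3$ with the primes 3, 5, 7 (indeed $3 + 5 > 7$), so $n = 105$ and the predicted coefficients of $X^7$ and $X^5$ are $1 - t = -2$ and $2 - t = -1$. The sketch below computes cyclotomic polynomials from the identity $X^n - 1 = \prod_{d \mid n} \Phi_d(X)$ by exact division; the function names are hypothetical and the method is chosen for clarity rather than speed.

```python
def poly_divide_exact(num: list[int], den: list[int]) -> list[int]:
    """Exact quotient num/den of integer polynomials, den monic.

    Coefficients are stored lowest degree first.
    """
    num = num[:]
    quotient = [0] * (len(num) - len(den) + 1)
    for k in range(len(quotient) - 1, -1, -1):
        lead = num[k + len(den) - 1]      # den is monic, so no division needed
        quotient[k] = lead
        for j, d in enumerate(den):
            num[k + j] -= lead * d
    assert not any(num), "division was not exact"
    return quotient

def cyclotomic(n: int, cache=None) -> list[int]:
    """Phi_n, obtained by dividing X^n - 1 by Phi_d for each proper divisor d."""
    cache = {} if cache is None else cache
    if n not in cache:
        poly = [-1] + [0] * (n - 1) + [1]          # X^n - 1
        for d in range(1, n):
            if n % d == 0:
                poly = poly_divide_exact(poly, cyclotomic(d, cache))
        cache[n] = poly
    return cache[n]

phi_105 = cyclotomic(105)
phi_210 = cyclotomic(210)
```

The coefficient of $X^7$ in `phi_210` has the opposite sign to that in `phi_105`, in line with $\Phi_{2m}(X) = \Phi_m(-X)$ for odd $m$.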


Exercise 4.

(a) Let $A = (a_{ij})$ be a matrix in $GL(n+1, \mathbb{Z})$, i.e., both $A$ and $A^{-1}$ have integer entries. Consider the polynomials $p_i(X) = \sum_{j=0}^{n} a_{ij} X^j$ for $0 \le i \le n$. Prove that any integral polynomial of degree at most $n$ is an integral linear combination of the $p_i(X)$. In particular, if $a_0, \ldots, a_n \in \mathbb{Q}$ are distinct, show that any rational polynomial of degree at most $n$ is of the form $\sum_{i=0}^{n} \lambda_i (X + a_i)^n$ for some $\lambda_i \in \mathbb{Q}$.

(b) Prove that $1 + X + \cdots + X^n = \sum_{i \ge 0} (-1)^i \binom{n-i}{i} X^i (1 + X)^{n-2i}$. Conclude that $\sum_{i \ge 0} \binom{n-i}{i} = \frac{\alpha^{n+1} - \beta^{n+1}}{\alpha - \beta}$, where $\alpha = \frac{1 + \sqrt5}{2}$, $\beta = \frac{1 - \sqrt5}{2}$. This is known as Binet’s formula. Further, compute $\sum_{i \ge 0} (-1)^i \binom{n-i}{i}$.

Remark 4. It is easily seen by induction that $\sum_{i \ge 0} \binom{n-i}{i}$ is just the $(n+1)$th Fibonacci number $F_{n+1}$.
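The polynomial identity of Exercise 4(b) and the Fibonacci statement of Remark 4 can be tested at integer points. The following sketch uses hypothetical function names and the convention $F_1 = F_2 = 1$.

```python
from math import comb

def diagonal_sum(n: int) -> int:
    """Sum over i >= 0 of C(n - i, i)."""
    return sum(comb(n - i, i) for i in range(n // 2 + 1))

def fibonacci(k: int) -> int:
    """F_k with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def geometric_side(n: int, x: int) -> int:
    """1 + x + ... + x^n."""
    return sum(x ** j for j in range(n + 1))

def binomial_side(n: int, x: int) -> int:
    """Sum of (-1)^i C(n-i, i) x^i (1+x)^(n-2i)."""
    return sum(
        (-1) ** i * comb(n - i, i) * x ** i * (1 + x) ** (n - 2 * i)
        for i in range(n // 2 + 1)
    )

identity_holds = all(
    geometric_side(n, x) == binomial_side(n, x)
    for n in range(0, 15) for x in range(-5, 6)
)
fibonacci_holds = all(diagonal_sum(n) == fibonacci(n + 1) for n in range(0, 30))
```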

As  remarked  earlier,  even  for  a  polynomial  of  degree  2  (like  X2  +  1)  it  is  unknown 
whether  it  takes  infinitely  many  prime  values.  A  general  conjecture  (Bouniakowsky, 
Schinzel  and  Sierpinski)  in  this  context  is: 

A nonconstant irreducible integral polynomial whose values at the integers have no nontrivial common factor always takes on a prime value.

We end with an open question which is typical of many number-theoretic questions: a statement which can be understood by the proverbial layman but an answer which proves elusive to this day to professional mathematicians. For any irreducible, monic, integral polynomial $P(X)$, define its Mahler measure to be $M(P) = \prod_i \max(|\alpha_i|, 1)$, where the product is over the roots $\alpha_i$ of $P$. The following is an easy exercise.

EXERCISE 5. $M(P) = 1$ if and only if $P$ is cyclotomic.

D  H  Lehmer  posed  the  following  question: 

Does there exist $C > 0$ such that $M(P) > 1 + C$ for all noncyclotomic (irreducible) polynomials $P$?

This is expected to have an affirmative answer and, indeed, Lehmer’s calculations indicate that the smallest possible value of $M(P) \ne 1$ is $1.176280818\ldots$, which occurs for the polynomial

$$P(X) = X^{10} + X^9 - X^7 - X^6 - X^5 - X^4 - X^3 + X + 1.$$

Lehmer’s  question  can  be  formulated  in  terms  of  discrete  subgroups  of  Lie  groups. 
One  may  not  be  able  to  predict  when  it  can  be  answered  but  it  is  more  or  less  certain 
that  one  will  need  tools  involving  deep  mathematics. 
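The numerical value quoted for Lehmer's polynomial can be reproduced with a short computation. The sketch below relies on an assumption not proved in the text: that this polynomial has exactly one root outside the unit circle and that this root is real and lies between 1 and 2, so that the Mahler measure equals that root. Under this assumption plain bisection suffices; the names are hypothetical.

```python
# Coefficients of Lehmer's polynomial, lowest degree first.
LEHMER = [1, 1, 0, -1, -1, -1, -1, -1, 0, 1, 1]

def evaluate(coeffs: list[int], x: float) -> float:
    """Horner evaluation of the polynomial at x."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def root_by_bisection(coeffs: list[int], lo: float, hi: float,
                      steps: int = 100) -> float:
    """Root of the polynomial in [lo, hi], given a sign change on the ends."""
    assert evaluate(coeffs, lo) < 0 < evaluate(coeffs, hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Assumed to equal the Mahler measure (single real root outside the unit circle).
lehmer_number = root_by_bisection(LEHMER, 1.0, 2.0)
```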
Prime Representing Quadratics


N  V  Tejaswi 


This  chapter  gives  a  proof  of  the  result  contained  in  the  remark  by  R  Tandon  in 
Chapter  9.  In  fact,  the  converse  of  that  statement  is  also  true.  This  result  was  proved 
by  Rabinowitz  and  Frobenius  around  1912.  A  much  simpler  proof  was  given  by 
Ayoub  and  Chowla  in  the  Journal  of  Number  Theory  13,  443-445  (1981).  We  believe 
the proof given here is new and is simpler than any of the available ones.¹

We use the notation contained in Chapter 9 with the additional observation that equivalent forms represent the same set of numbers. Let $p$ be a prime with $p \equiv 3 \pmod 4$ and $n = (p + 1)/4$. We have:

THEOREM 1. The class number of forms with discriminant $-p$ is 1, i.e., $h(-p) = 1$, if and only if for each $x$, $0 \le x < n - 1$, $x^2 + x + n$ is a prime number.


PROOF. Suppose that $h(-p) = 1$ and, if possible, that there exists an integer $b$, $0 \le b < n - 1 = (p - 3)/4$, such that $b^2 + b + n$ is not a prime. Then there is a prime $q$ such that

$$b^2 + b + n = aq,$$

with $q^2 \le b^2 + b + (p + 1)/4$. We have

$$4q^2 \le (2b + 1)^2 + p < \Bigl(2 \cdot \frac{p - 3}{4} + 1\Bigr)^2 + p = \Bigl(\frac{p + 1}{2}\Bigr)^2,$$

i.e.,

$$q < \frac{p + 1}{4},$$

and

$$4aq = (2b + 1)^2 + p.$$



¹ After this article had been submitted to Resonance, I came to know that Frobenius’s proof closely resembles this proof.
Consider the quadratic forms

$$f(x, y) = x^2 + xy + ny^2 \quad\text{and}\quad g(x, y) = ax^2 + (2b + 1)xy + qy^2.$$

Both have discriminant equal to $-p$. Since the class number is 1, both these forms should be equivalent, and hence should represent the same set of integers. Clearly, $q$ is representable by $g(x, y)$ (take $x = 0$, $y = 1$). But $q$ is not representable by $f(x, y)$. Indeed, if $f(x, y) = q$ then $y \ne 0$, for, if $y = 0$ then $q$ would be a square; and for $y \ne 0$ we have

$$f(x, y) = \tfrac14\bigl((2x + y)^2 + py^2\bigr) \ge \frac{p + 1}{4},$$

while $q < (p + 1)/4$.

For the converse, suppose that $h(-p) \ge 2$; note that $p > 7$ since $h(-3) = h(-7) = 1$. Then there exists a reduced form

$$g(x, y) = ax^2 + bxy + cy^2$$

with discriminant $-p$ which is not equivalent to the (reduced) form

$$f(x, y) = x^2 + xy + ny^2.$$

From the definition of reduced quadratic forms we have that $a, c > 1$ and $|b| \le a \le \sqrt{p/3}$. Further note that $b$ is odd and hence $b = 2b' + 1$ for some integer $b'$. Clearly, $b' < n - 1$ (as $p > 7$) and we have

$$b'^2 + b' + n = ac,$$

which shows that for $x = b'$, $x^2 + x + n$ is not a prime number, thereby proving the converse.
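Theorem 1 can be compared against a direct computation for small primes. The sketch below counts reduced forms $ax^2 + bxy + cy^2$ of discriminant $-p$ using the standard reduction conditions ($-a < b \le a \le c$, with $b \ge 0$ when $a = c$); these conditions and all function names are assumptions brought in for illustration, since the definition of reduced forms is given in Chapter 9 rather than here.

```python
def is_prime(m: int) -> bool:
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def class_number(p: int) -> int:
    """Number of reduced forms of discriminant -p, for a prime p = 3 mod 4."""
    count, a = 0, 1
    while 3 * a * a <= p:                 # reduced forms have a <= sqrt(p/3)
        for b in range(-a + 1, a + 1):    # -a < b <= a
            if (b * b + p) % (4 * a) == 0:
                c = (b * b + p) // (4 * a)
                if c > a or (c == a and b >= 0):
                    count += 1
        a += 1
    return count

def all_values_prime(p: int) -> bool:
    """Is x^2 + x + n prime for 0 <= x < n - 1, where n = (p + 1)/4?"""
    n = (p + 1) // 4
    return all(is_prime(x * x + x + n) for x in range(n - 1))

candidates = [p for p in range(3, 200) if p % 4 == 3 and is_prime(p)]
class_number_one = [p for p in candidates if class_number(p) == 1]
prime_producing = [p for p in candidates if all_values_prime(p)]
```

On this range the two lists coincide, as the theorem predicts; the last entry, $p = 163$, corresponds to Euler's polynomial $x^2 + x + 41$.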

The following problem appeared in the 28th International Mathematical Olympiad in 1987.

PROBLEM. Let $n$ be a natural number. If $k^2 + k + n$ is a prime number for $0 \le k \le [\sqrt{n/3}]$, show that $k^2 + k + n$ is a prime for $0 \le k \le n - 2$.

In view of this we can restate the above theorem as

THEOREM 1'. The class number of forms with discriminant $-p$ is 1, i.e., $h(-p) = 1$, if and only if for each $x$, $0 \le x \le [\sqrt{n/3}]$, $x^2 + x + n$ is a prime number.
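The reduction from the long range $0 \le k \le n - 2$ to the short range $0 \le k \le [\sqrt{n/3}]$ can be observed experimentally. This is a minimal sketch with hypothetical names; it checks the implication for $2 \le n \le 2000$ and does not replace the proof.

```python
from math import isqrt

def is_prime(m: int) -> bool:
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def prime_up_to(n: int, upper: int) -> bool:
    """Is k^2 + k + n prime for every k with 0 <= k <= upper?"""
    return all(is_prime(k * k + k + n) for k in range(upper + 1))

def short_test(n: int) -> bool:
    # floor(sqrt(n/3)) equals isqrt(n // 3) for integers n >= 0
    return prime_up_to(n, isqrt(n // 3))

def full_test(n: int) -> bool:
    return prime_up_to(n, n - 2)

passing = [n for n in range(2, 2001) if short_test(n)]
implication_holds = all(full_test(n) for n in passing)
```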

Acknowledgement:  I  thank  C  S  Yogananda  for  suggesting  the  problem  and  for 
pointing  out  some  errors  in  the  first  draft  of  this  chapter. 
The Congruent Number Problem


V  Chandrasekar 

In  Mathematics,  especially  number  theory,  one  often  comes  across  problems 
which  arise  naturally  and  are  easy  to  pose,  but  whose  solutions  require  very 
sophisticated methods. What is known as ‘The Congruent Number Problem’ is
one  such.  Its  statement  is  very  simple  and  the  problem  dates  back  to  antiq¬ 
uity,  but  it  was  only  recently  that  a  breakthrough  was  made,  thanks  to  current 
developments  in  the  Arithmetic  of  elliptic  curves,  an  area  of  intense  research  in 
number  theory. 


Introduction 

A  positive  integer  n  is  called  a  congruent  number  if  there  exists  a  right-angled  trian¬ 
gle  whose  sides  are  rational  numbers  and  whose  area  is  the  given  number  n. 

If we represent the sides of such a triangle by $X$, $Y$, $Z$, with $Z$ as the hypotenuse, then by our definition, a positive integer $n$ is a congruent number if and only if the two equations

$$X^2 + Y^2 = Z^2, \qquad \frac{XY}{2} = n$$

have a solution with $X$, $Y$, $Z$ all rational numbers.

Examples. 

1. Consider the right-angled triangle with sides $X = 3$, $Y = 4$ and $Z = 5$. Its
area $n$ is $XY/2 = 6$, so 6 is a congruent number. Here we are lucky to find a
suitable  triangle  for  the  number  6  whose  sides  are  actually  integers.  It  will  be 
seen  that  this  is  in  general  an  exceptional  circumstance. 

2.  Consider  the  triangle  with  sides  3/2,  20/3  and  41/6.  This  is  a  right-angled 
triangle  (!)  and  its  area  is  5.  Therefore  5  is  a  congruent  number. 

QUESTION.  Does  there  exist  a  right-angled  triangle  with  integral  sides  and  area 
equal  to  5? 
QUESTION. Is 1 a congruent number? (There is a lot of history behind this which
will  be  narrated  below.) 

One can generate congruent numbers at will by making use of the identity

$$(X^2 - Y^2)^2 + (2XY)^2 = (X^2 + Y^2)^2$$

which corresponds to the right-angled triangle with sides $X^2 - Y^2$, $2XY$ and hypotenuse $X^2 + Y^2$. We substitute our choice of integer values for $X$ and $Y$ and obtain the congruent number $n = XY(X^2 - Y^2)$. For example, $X = 3$, $Y = 2$ yields the triangle with sides 5, 12, 13 and area 30. So 30 is a congruent number. For more examples refer to Box 14.1.

Now any positive integer $n$ can be written as $n = u^2 v$, where $v$ has no square
factors  (v  is  a  ‘squarefree  integer’).  It  is  clear  that  n  is  a  congruent  number  if  and 
only  if  v  is  so;  the  right-angled  triangle  for  v  can  be  obtained  from  the  corresponding 
one  for  n,  if  it  exists,  by  scaling  it  down  by  a  factor  of  u.  (Remember  that  we  allow  the 
side  lengths  to  take  fractional  values!)  So  when  deciding  whether  n  is  congruent  or 
not,  we  may  assume  that  n  is  a  squarefree  integer.  This  will  be  done  in  what  follows. 


Box 14.1 Generating Congruent Numbers

Here $p$, $q$ are arbitrary positive integers of opposite parity (that is, $p + q$ is odd), the congruent number $n$ is the squarefree part of $pq(p^2 - q^2)$, and the sides of the triangle are proportional to $p^2 - q^2$, $2pq$, $p^2 + q^2$.

Serial number   p    q    n    Sides of the triangle
1               3    2    30   5, 12, 13
2               4    3    21   7/2, 12, 25/2
3               5    4    5    3/2, 20/3, 41/6
4               9    4    65   65/6, 12, 97/6
5               25   16   41   40/3, 123/20, 881/60
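The rows of Box 14.1 can be regenerated mechanically: compute the area $pq(p^2 - q^2)$ of the integer triangle, pull out its largest square factor $u^2$, and scale the sides down by $u$. The sketch below uses exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def square_part(m: int) -> int:
    """Largest u such that u*u divides m."""
    u, d = 1, 2
    while d * d <= m:
        while m % (d * d) == 0:
            m //= d * d
            u *= d
        d += 1
    return u

def congruent_row(p: int, q: int):
    """Squarefree congruent number n and the scaled triangle (leg, leg, hypotenuse)."""
    legs_and_hyp = (p * p - q * q, 2 * p * q, p * p + q * q)
    area = p * q * (p * p - q * q)
    u = square_part(area)
    n = area // (u * u)
    sides = tuple(Fraction(s, u) for s in legs_and_hyp)
    return n, sides

rows = [congruent_row(p, q) for p, q in [(3, 2), (4, 3), (5, 4), (9, 4), (25, 16)]]
```

Each returned triple satisfies $X^2 + Y^2 = Z^2$ and $XY/2 = n$ exactly, which is how the hypotenuse $881/60$ in the last row can be confirmed.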

Now  we  are  ready  to  formulate: 

The  Congruent  Number  Problem.  Given  a  positive  integer  n,  is  there  a  simple 
criterion  which  enables  us  to  decide  whether  or  not  n  is  congruent? 

A  few  remarks  are  in  order.  To  start  with,  if  we  restrict  the  sides  of  the  triangle  to 
integer  values  only,  the  question  can  be  settled,  at  least  in  theory,  in  a  finite  number 
of  steps.  To  see  why,  recall  the  equations 

$$X^2 + Y^2 = Z^2, \qquad \frac{XY}{2} = n.$$

Since  X  and  Y  are  now  integers,  X  and  Y  both  divide  2 n.  So  to  see  if  a  solution 
exists,  we  let  X  run  through  the  set  of  divisors  of  2 n,  let  Y  =  2 n/X  and  check 
whether  X2  +  Y2  is  a  square  integer.  Thus  the  problem  can  be  settled  in  a  routine 
manner.  For  example,  we  can  easily  verify  that  there  is  no  integral  solution  for  the 
case  n  =  5.  (Note,  however,  that  we  do  know  that  5  is  a  congruent  number.) 
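The finite search just described takes only a few lines. This is a sketch with a hypothetical function name; it lists every integer-sided right triangle of area $n$ with legs $X \le Y$.

```python
from math import isqrt

def integer_triangles(n: int) -> list[tuple[int, int, int]]:
    """All integer right triangles (X <= Y, hypotenuse Z) with area n."""
    found = []
    for x in range(1, isqrt(2 * n) + 1):      # X <= Y forces X*X <= 2n
        if (2 * n) % x == 0:
            y = (2 * n) // x
            z = isqrt(x * x + y * y)
            if z * z == x * x + y * y:
                found.append((x, y, z))
    return found

triangles_for_5 = integer_triangles(5)
triangles_for_6 = integer_triangles(6)
```

For $n = 5$ the search comes back empty, while for $n = 6$ it returns the familiar triangle with sides 3, 4, 5.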
But once we allow the sides to have rational values, the problem acquires an entirely different status. There is no obvious starting point, unlike the case of integer solutions discussed above. One could endlessly churn out congruent numbers following the method in Box 14.1 without being certain when a given number $n$ (or $n \times m^2$, for some integer $m$) will appear on the list. Continuing in this way would exhaust one’s computing resources, not to mention one’s patience! Also, this procedure is of no avail if $n$ is not a congruent number.

To appreciate this better, consider the following right-angled triangle with area 101 which was found by Bastien in 1914. This triangle has sides

$$X = \frac{711024064578955010000}{118171431852779451900}, \qquad Y = \frac{3967272806033495003922}{118171431852779451900},$$

and hypotenuse

$$Z = \frac{2 \times 2015242462949760001961}{118171431852779451900}.$$

This is known to be the smallest solution (in terms of the sizes of the numerator and denominator) corresponding to the congruent number 101! The serial number of this triangle in the list in Box 14.1 would exceed $10^{20}$!

The  above  considerations  force  us  to  look  for  a  more  indirect  approach  in  our 
search  for  a  criterion  for  characterizing  congruent  numbers. 

Here  we  have  yet  another  instance  of  a  problem  in  number  theory  which  is  simple 
to  state,  yet  has  hidden  depths.  There  have  been  instances  when  the  solutions  of  such 
problems  have  emerged  only  centuries  after  being  posed.  In  such  instances,  a  lot  of 
deep  and  beautiful  mathematics  gets  generated  as  a  result.  A  striking  example  from 
recent times is the proof of Fermat’s last theorem by Andrew Wiles in 1995, which
uses a mind-boggling variety of techniques from several fields in current mathematical
research.

We shall see how the congruent number problem falls into this category by giving
a brief account of its history and of the concepts and techniques that were used in the
solution of this problem, which is so deceptively simple to state.


Brief  History 

The congruent number problem makes its earliest appearance in an Arab manuscript
traced to the tenth century (c 972 AD). In his classic History of the Theory of Numbers,
Vol 2 (Diophantine Analysis), Dickson quotes Woepcke’s view that there is no
indication that the Arabs knew Diophantus prior to the translation by Abu’l-Wafa
(998 AD), but they may well have come across the problem from the Hindus who
were already acquainted with his work. The Arabs figured out that the following
numbers are congruent: 5, 6, 14, 15, 21, 30, 34, 65, 70, 110, 154, 190 and so on. In
fact, their list contains ten congruent numbers greater than 100, for example, 10374.

The scene later shifts to Pisa, where Leonardo Pisano (better known as Fibonacci),
by virtue of his position as a mathematical expert in his native city, is presented to
the Emperor Frederick II. The king’s scholars challenge him to find three rational
numbers whose squares form an arithmetic progression with common difference 5.
This is equivalent to finding integers $X, Y, Z, T$, with $T \neq 0$, such that
$Y^2 - X^2 = Z^2 - Y^2 = 5T^2$, and this in turn reduces to finding a right triangle
with rational sides

$$\frac{Z + X}{T}, \qquad \frac{Z - X}{T}, \qquad \frac{2Y}{T}$$

and area 5; in other words, to the question of whether 5 is a congruent number or not.
Leonardo addressed the general problem in his memoir Liber Quadratorum (1225),
which was lost to the world till it was found and published by Prince Boncompagni
in the year 1856. In addition to showing that 5 and 7 are congruent numbers
(the triangles have sides 3/2, 20/3, 41/6 and 35/12, 24/5, 337/60 respectively), he
also states without proof that no congruent number can be a square, or equivalently
that 1 is not a congruent number.

The  proof  of  this  statement  had  to  wait  for  four  centuries.  Eventually  it  led  to 
Fermat’s  discovery  of  his  method  of  infinite  descent,  which  was  to  have  a  profound 
effect  on  subsequent  developments  in  arithmetic,  or  number  theory  as  we  now  call  it. 

Fermat had been in correspondence with many of his contemporaries regarding the
existence of a right-angled triangle with rational sides and a square area. An explicit
reference to the application of his technique to prove that this is impossible appears
in his letter to Huygens in 1659, where he states: “As ordinary methods, such as are
found in the books, are inadequate to proving such difficult propositions, I discovered
at last a most singular method ... which I call the infinite descent. At first I used
it only to prove negative assertions such as ... ‘there is no right angled triangle
in numbers whose area is a square’. To apply it to affirmative questions is much
harder, so that, when I had to prove that ‘Every prime of the form $4n + 1$ is a sum
of two squares’, I found myself in a sorry plight. But at last such questions proved
amenable to my method.” (We infer that the technique of infinite descent had its first
application in number theory to the problem of congruent numbers.) Continuing,
Fermat gives a cryptic description of his method: “If the area of such a triangle were
a square, then there would also be a smaller one with the same property, and so
on, which is impossible, ...”. He adds that to explain how his method works would
make his discourse too long, as the whole mystery of his method lay there. To quote
Weil, “Fortunately, just for once, he (Fermat) had found room for this mystery in the
margin of the very last proposition of Diophantus”.

Before  reproducing  Fermat’s  proof  we  prove  the  following: 

PROPOSITION 1. Let $X, Y, Z$ be the sides of an integer-sided right-angled triangle,
with $Z$ the hypotenuse, such that $X, Y, Z$ have no common factors. Then there exist
relatively prime integers $p, q$ such that $p + q$ is odd, $\{X, Y\} = \{p^2 - q^2, 2pq\}$ and
$Z = p^2 + q^2$.

PROOF. Clearly $X$ and $Y$ cannot both be even, as $X, Y, Z$ have no common factors.
Both cannot be odd, for in this case both $X^2$ and $Y^2$ would be 1 modulo 4, implying
that $Z^2 \equiv 2 \pmod 4$; but this is absurd as no square is of the form 2 (mod 4). Thus
one of them, say $X$, is odd and the other, $Y$, is even. It follows that $Z$ is odd and that
$Z + X$, $Z - X$ are both even. Therefore $(Z + X)/2$ and $(Z - X)/2$ are integers;
indeed they are coprime, because $X$ and $Z$ are themselves coprime.

Since $Y^2 = Z^2 - X^2$, we obtain:

$$\left(\frac{Y}{2}\right)^2 = \frac{Z + X}{2} \cdot \frac{Z - X}{2}.$$

By the unique factorization property of the integers, each factor on the right side
must be a square. Thus $(Z + X)/2 = p^2$, $(Z - X)/2 = q^2$ with $p$ and $q$ coprime.
Solving, we obtain

$$X = p^2 - q^2, \qquad Y = 2pq, \qquad Z = p^2 + q^2.$$

Since $Z$ is odd, $p + q$ is odd.
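
The proof of Proposition 1 is constructive, and a small sketch may make this concrete. The function names `triple_from` and `recover_pq` below are illustrative choices, not taken from the text; `recover_pq` assumes it is handed a primitive Pythagorean triple and follows the proof step by step.

```python
from math import gcd, isqrt

def triple_from(p, q):
    """Return (X, Y, Z) = (p^2 - q^2, 2pq, p^2 + q^2) as in Proposition 1."""
    return p * p - q * q, 2 * p * q, p * p + q * q

def recover_pq(X, Y, Z):
    """Recover (p, q) from a primitive triple, following the proof:
    with X the odd leg, (Z + X)/2 = p^2 and (Z - X)/2 = q^2."""
    if X % 2 == 0:          # make X the odd leg, as in the proof
        X, Y = Y, X
    p_squared, q_squared = (Z + X) // 2, (Z - X) // 2
    p, q = isqrt(p_squared), isqrt(q_squared)
    if p * p != p_squared or q * q != q_squared:
        raise ValueError("not a primitive Pythagorean triple")
    return p, q

print(recover_pq(3, 4, 5))    # (2, 1)
print(recover_pq(12, 5, 13))  # (3, 2)
```

Going the other way, `triple_from(3, 2)` gives `(5, 12, 13)`, so the two functions undo each other on primitive triples.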


Fermat’s  Legacy 

We now reproduce Fermat’s proof by the method of descent in the following:

THEOREM. 1 is not a congruent number.

PROOF. Suppose, on the contrary, that 1 is a congruent number; i.e., after clearing
denominators and removing common factors, there exists a right-angled triangle with
integral sides whose area is a square integer. In view of Proposition 1, its sides must
be of the form $2pq$, $p^2 - q^2$, $p^2 + q^2$ with $p > q > 0$, $p + q$ odd and $\gcd(p, q) = 1$.

Since the area ($= pq(p - q)(p + q)$) is a square integer and the numbers $p$,
$q$, $p - q$, $p + q$ are mutually coprime, it follows that each of these numbers is a
square integer. We write

$$p = x^2, \qquad q = y^2, \qquad p + q = u^2, \qquad p - q = v^2.$$

Since $u$ and $v$ are odd and coprime, it follows that the gcd of $u + v$ and $u - v$ is 2. But
now we have

$$2y^2 = 2q = u^2 - v^2 = (u + v)(u - v).$$

Arguing as in Proposition 1, we see that there exist integers $r, s$ such that
$(u + v, u - v) = (2r^2, 4s^2)$ or $(u + v, u - v) = (4r^2, 2s^2)$. The former case leads to
$u = r^2 + 2s^2$, $v = r^2 - 2s^2$ and therefore to

$$x^2 = \frac{u^2 + v^2}{2} = r^4 + 4s^4.$$

Hence $r^2$, $2s^2$, $x$ are the sides of a right-angled triangle with area $(rs)^2$ and hypotenuse
$x = \sqrt{p} < p^2 + q^2$ (the hypotenuse of the triangle with which we started). The case
$u + v = 4r^2$, $u - v = 2s^2$ is dealt with in a similar fashion.

So,  starting  from  a  right-angled  triangle  with  integral  sides  whose  area  is  a  square 
integer,  we  have  produced  another  triangle  of  the  same  type  with  a  smaller  hypotenuse 
than  the  original  triangle.  Clearly  this  process  can  be  repeated.  But  this  gives  rise  to 
an infinite decreasing sequence of positive integers, a clear absurdity. (This is the
central principle behind infinite descent.) We are thus led to a contradiction and we
conclude that 1 is not a congruent number.

The non-congruent nature of the number 1 is of special interest because it shows
that there is no non-trivial solution to the equation $X^4 - Y^4 = Z^2$, which in turn
implies Fermat’s last theorem (‘The equation $X^n + Y^n = Z^n$ has no non-trivial
solutions in integers for $n > 2$’) for the case $n = 4$.

In  the  following  two  propositions  we  prove  the  claims  made  above. 

PROPOSITION 2. A number $n$ is congruent if and only if there exists a rational number
$a$ such that $a^2 + n$ and $a^2 - n$ are both squares of rational numbers.

PROOF. Let $n$ be a congruent number and let $X, Y, Z$ be rational numbers satisfying

$$X^2 + Y^2 = Z^2, \qquad \frac{XY}{2} = n.$$

Then $X^2 + Y^2 \pm 2XY = Z^2 \pm 4n$, so

$$\left(\frac{X \pm Y}{2}\right)^2 = \left(\frac{Z}{2}\right)^2 \pm n.$$

So if we take $a = Z/2$, then $a$ is rational and $a^2 + n$ and $a^2 - n$ are both squares of
rational numbers.

For the converse, let $a$ be a rational number such that $a^2 + n$ and $a^2 - n$ are squares
of rational numbers. Let

$$X = \sqrt{a^2 + n} + \sqrt{a^2 - n}, \qquad Y = \sqrt{a^2 + n} - \sqrt{a^2 - n},$$

and

$$Z = \sqrt{X^2 + Y^2} = \sqrt{4a^2} = 2a.$$

Then $X, Y, Z$ are the sides of a right-angled triangle with rational sides and area
$XY/2 = ((a^2 + n) - (a^2 - n))/2 = n$.
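
As an illustration of the converse direction of Proposition 2, the sketch below rebuilds Fibonacci’s triangle of area 5 from the rational number $a = 41/12$ (half of its hypotenuse $41/6$). The helper names `rational_sqrt` and `triangle_from_a` are hypothetical, introduced only for this example.

```python
from fractions import Fraction
from math import isqrt

def rational_sqrt(r):
    """Exact square root of a Fraction, or None if r is not a rational square."""
    if r < 0:
        return None
    num_root, den_root = isqrt(r.numerator), isqrt(r.denominator)
    if num_root ** 2 == r.numerator and den_root ** 2 == r.denominator:
        return Fraction(num_root, den_root)
    return None

def triangle_from_a(a, n):
    """Return (X, Y, Z) as in the proof of Proposition 2, or None if
    a^2 + n or a^2 - n is not the square of a rational number."""
    a = Fraction(a)
    s, t = rational_sqrt(a * a + n), rational_sqrt(a * a - n)
    if s is None or t is None:
        return None
    return s + t, s - t, 2 * a

X, Y, Z = triangle_from_a(Fraction(41, 12), 5)
print(X, Y, Z)  # 20/3 3/2 41/6
```

Here $a^2 + 5 = (49/12)^2$ and $a^2 - 5 = (31/12)^2$, which is exactly the arithmetic progression of squares with common difference 5 asked for in Fibonacci’s challenge.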

PROPOSITION 3. If there are non-zero integers $X, Y, Z$ such that $X^4 - Y^4 = Z^2$,
then 1 is a congruent number.

PROOF. Write the equation in the form

$$X^4 = Y^4 + Z^2.$$

Using Proposition 1, we deduce that there exist integers $p, q$ such that $X^2 = p^2 + q^2$
and $Y^2 = p^2 - q^2$. But this leads to

$$\frac{p^2}{q^2} + 1 = \left(\frac{X}{q}\right)^2, \qquad \frac{p^2}{q^2} - 1 = \left(\frac{Y}{q}\right)^2.$$

So $p^2/q^2$ is a rational square such that $p^2/q^2 + 1$ and $p^2/q^2 - 1$ are squares of
rational numbers. In other words, by Proposition 2 (with $a = p/q$), 1 is a congruent number.

Combining Propositions 2 and 3 with the fact that 1 is a non-congruent number,
we deduce Fermat’s last theorem for $n = 4$.
Before closing this section, it is fitting to quote Weil’s lavish praise of Fermat and
his justly-famous method: “The true breakthrough came in 1922 with Mordell’s
celebrated paper; here, if Fermat’s name does not occur, the use of the words ‘infinite
descent’ shows that Mordell was well aware of his indebtedness to his remote
predecessor. Since then the theory of elliptic curves, and its generalizations to curves
of higher genus and to abelian varieties, has been one of the main topics of modern
number theory. Fermat’s name, and his method of infinite descent, are indissolubly
bound with it; they promise to remain so in the future”.


Congruent  Numbers  and  Elliptic  Curves 

Congruent numbers continued to excite the curiosity of number theorists over the
years. Their congruence properties have been investigated and tables of such numbers
constructed. Some classes of numbers have also been identified as congruent
numbers. To cite an example, a result due to Heegner and Birch shows that if $n$ is a
prime number of the form 5 (mod 8) or of the form 7 (mod 8) then $n$ is a congruent
number. (See Box 14.2.)

But what is ultimately sought is a simple and complete characterization of all
congruent numbers; in other words, an algorithm which will quickly determine whether
a given natural number $n$ is congruent or not.


Box  14.2  Some  Classes  of  Congruent  Numbers 

This box displays some results given in the paper by K Feng [5]. It characterises
some classes of congruent and non-congruent numbers in terms of
their divisibility properties.

To illustrate, Gross’s result states that if a squarefree integer $n$ has at
most two prime factors and is of the form 5, 6 or 7 (mod 8), then $n$ is a congruent
number.

If $p$ and $q$ are odd primes, then the Legendre symbol $(p/q)$ is 1 if $p$ is a
quadratic residue modulo $q$ (that is, if the equation $x^2 \equiv p \pmod q$ has a
solution), else $-1$.

In the following account, $n$ is taken to be a squarefree integer. The symbol
‘CN’ means ‘congruent number’, while ‘Non-CN’ means ‘non-congruent
number’; $p, q, r$ denote distinct primes and $p_i$ refers to an arbitrary prime
congruent to $i$ mod 8.

For CN

• $n = 2p_3$ (Heegner 1952, Birch 1968).

• $n = p_5, p_7$ (Stephens 1975).

• $n = p^u q^v \equiv 5, 6, 7 \pmod 8$, $0 \le u, v \le 1$ (B Gross 1985).

• $n = 2p_3p_5$, $2p_5p_7$.

• $n = 2p_1p_7$, with $(p_1/p_7) = -1$ (Monsky 1990).

• $n = 2p_1p_3$, with $(p_1/p_3) = -1$.

For Non-CN

• $n = p_3$, $p_3q_3$, $2p_5$, $2p_5q_5$ (Genocchi 1855).

• $n = 2p$, with $p \equiv 9 \pmod{16}$ (Bastien 1913).

• $n = p_1p_3$, with $(p_1/p_3) = -1$ (Lagrange 1974).

• $n = 2p_1p_5$, with $(p_1/p_5) = -1$.

• $n = p_1p_3q_1$, with $(p_1/p_3) = (p_3/q_1) = -1$.


As  it  happened,  the  search  for  such  an  algorithm  was  made  possible  by  relating 
the  congruent  number  problem  to  the  arithmetic  of  elliptic  curves. 

This connection is established as follows. From Proposition 2 we know that a
number $n$ is congruent precisely when there exists a rational square, say $u^2$, such that
$u^2 + n$ and $u^2 - n$ are both rational squares. This implies that $u^4 - n^2$ is a rational
square, say $v^2$; or equivalently that $u^6 - n^2u^2 = u^2v^2$. Setting $x = u^2$ and $y = uv$ we
arrive at the equation $y^2 = x^3 - n^2x$. Thus if $n$ is a congruent number, we obtain a rational
point $(x, y)$ on the curve represented by the equation $y^2 = x^3 - n^2x$.
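
To make the passage from a triangle to a point on the curve concrete, here is a small sketch. It takes $u = Z/2$ as in Proposition 2 and $v = |X^2 - Y^2|/4$, which satisfies $v^2 = u^4 - n^2$; the function name `point_from_triangle` and the choice of the positive sign for $v$ are assumptions made for this illustration.

```python
from fractions import Fraction

def point_from_triangle(X, Y, Z):
    """Map a rational right triangle (legs X, Y, hypotenuse Z) of area n
    to the point (x, y) = (u^2, u*v) on y^2 = x^3 - n^2 x."""
    X, Y, Z = Fraction(X), Fraction(Y), Fraction(Z)
    n = X * Y / 2
    u = Z / 2
    v = abs(X * X - Y * Y) / 4   # v^2 = u^4 - n^2
    return n, (u * u, u * v)

n, (x, y) = point_from_triangle(3, 4, 5)
print(n, x, y)                        # 6 25/4 35/8
print(y * y == x ** 3 - n * n * x)    # True
```

For the 3, 4, 5 triangle of area 6 this gives the point $(25/4, 35/8)$ on $y^2 = x^3 - 36x$.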

Now  the  curves  corresponding  to  the  equation  y2  =  x3  -  n2x  are  examples  of 
what  are  known  as  elliptic  curves.  The  arithmetic  of  these  curves  has  been  a  central 
topic  of  research  in  Number  Theory  over  the  years.  In  view  of  the  above  connection, 
it  was  natural  to  expect  that  the  results  relating  to  elliptic  curves  would  be  able  to 
settle  the  congruent  number  problem.  This  expectation  was  realized  when  J  Tunnell 
succeeded  in  finding  a  simple  algorithm  for  the  problem.  (See  Box  14.3  for  a  brief 
outline  of  the  logical  steps  involved  in  Tunnell’s  method.) 

Let  the  reader  be  reassured  that  to  apply  the  algorithm  one  does  not  need  to  know 
anything  about  elliptic  curves,  modular  forms,  liftings  or  L-functions  which  are  (to 
name  a  few)  some  of  the  concepts  and  techniques  which  lie  at  the  basis  of  Tunnell’s 
work! 

In  what  follows,  #S  denotes  the  number  of  elements  of  a  set  S. 

TUNNELL’S THEOREM (1983). Let $n$ be a squarefree congruent number (that is, $n$
is the area of a right-angled triangle with rational sides). Define $A_n, B_n, C_n, D_n$ as
follows:

$$A_n = \#\{(x, y, z) \in \mathbb{Z}^3 \mid n = 2x^2 + y^2 + 32z^2\},$$

$$B_n = \#\{(x, y, z) \in \mathbb{Z}^3 \mid n = 2x^2 + y^2 + 8z^2\},$$

$$C_n = \#\{(x, y, z) \in \mathbb{Z}^3 \mid n = 8x^2 + 2y^2 + 64z^2\},$$

$$D_n = \#\{(x, y, z) \in \mathbb{Z}^3 \mid n = 8x^2 + 2y^2 + 16z^2\}.$$

Then:

(A) $A_n = B_n/2$ if $n$ is odd; and

(B) $C_n = D_n/2$ if $n$ is even.

If the Birch-Swinnerton-Dyer conjecture is true, then, conversely, these equalities
imply that $n$ is a congruent number.
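
The four counts in Tunnell’s theorem are finite and easy to compute by brute force. The sketch below is one straightforward way to do so, assuming $n$ is a positive squarefree integer; the function names are illustrative. Note what the result means: if `passes_tunnell(n)` is false, then $n$ is not congruent; if it is true, $n$ is congruent only under the assumption that the Birch-Swinnerton-Dyer conjecture holds.

```python
from math import isqrt

def count_representations(n, a, b, c):
    """Count integer triples (x, y, z) with a*x^2 + b*y^2 + c*z^2 == n."""
    count = 0
    x_max, y_max = isqrt(n // a), isqrt(n // b)
    for x in range(-x_max, x_max + 1):
        for y in range(-y_max, y_max + 1):
            rest = n - a * x * x - b * y * y
            if rest < 0 or rest % c != 0:
                continue
            z_squared = rest // c
            z = isqrt(z_squared)
            if z * z == z_squared:
                count += 1 if z == 0 else 2   # z and -z are distinct solutions
    return count

def passes_tunnell(n):
    """Check equality (A) for odd n, or equality (B) for even n."""
    if n % 2 == 1:
        A = count_representations(n, 2, 1, 32)
        B = count_representations(n, 2, 1, 8)
        return 2 * A == B
    C = count_representations(n, 8, 2, 64)
    D = count_representations(n, 8, 2, 16)
    return 2 * C == D

print([n for n in (1, 2, 3, 5, 6, 7) if passes_tunnell(n)])  # [5, 6, 7]
```

The output agrees with the history above: 1, 2 and 3 fail the test and are therefore non-congruent, while 5, 6 and 7 pass it and are indeed known to be congruent from explicit triangles.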


Box  14.3  Elliptic  Curves  and  the  Congruent  Number  Problem 

For each natural number $n$, let $E_n$ denote the elliptic curve represented by
the equation $y^2 = x^3 - n^2x$. Then we have the following correspondence
between the set of right-angled triangles with rational sides and area $n$ and
the set of rational points on $E_n$. Let the sides be $A, B, C$ where $A, B, C$ are
rational and $A < B < C$, and let $(x, y)$ be a rational point on $E_n$ such that:
(a) $x$ is the square of a rational number, (b) the denominator of $x$ is even,
(c) the numerator of $x$ has no common factor with $n$. The correspondence
is given as follows:

$$(x, \pm y) \longmapsto \left(\sqrt{x + n} - \sqrt{x - n},\ \sqrt{x + n} + \sqrt{x - n},\ 2\sqrt{x}\right),$$

$$(A, B, C) \longmapsto \left(\frac{C^2}{4},\ \pm\frac{(B^2 - A^2)C}{8}\right).$$

It can be shown by means of the above bijection that a number $n$ is congruent
if and only if there exist infinitely many rational solutions $(x, y)$ on the
elliptic curve $E_n$.

To each elliptic curve $E_n$, there is associated an important number $L(E_n)$,
which we shall not attempt to define. It is known (this is the Coates-Wiles
Theorem) that if $E_n$ has infinitely many rational solutions, then $L(E_n) = 0$.
Combining this with the remark in the previous paragraph we deduce the
following: If $L(E_n)$ is not zero, then $n$ is a non-congruent number.

The converse statement, namely that $L(E_n) = 0$ implies the existence of
infinitely many rational points on $E_n$ (in other words, that $L(E_n) = 0$
implies that $n$ is congruent) would follow from a famous conjecture due to
Birch and Swinnerton-Dyer. (This conjecture has been made for all elliptic
curves, not just for the $E_n$ defined above.)

Now Tunnell’s work can be summarized in one line: he has found an expression
for $L(E_n)$ which is of the form

$$L(E_n) = \begin{cases} C \times (A_n - B_n/2), & \text{if } n \text{ is odd}, \\ C \times (C_n - D_n/2), & \text{if } n \text{ is even}. \end{cases}$$

Here $C$ is a non-zero number, and $A_n, B_n, C_n, D_n$ are the quantities defined
in the statement of Tunnell’s theorem.

The justification of Tunnell’s algorithm follows from the above mentioned
facts.


Observe  that  Tunnell’s  algorithm  helps  one  to  establish  whether  a  given  number  n 
is  non-congruent. 
Examples.

1. Let $n = 1$; then $A_n = B_n = 2$, so equation (A) is not valid. We conclude that 1
is not a congruent number.

2. We show similarly that 2 and 3 are not congruent numbers.

3. Let $n$ be squarefree, odd and congruent to 5 or 7 modulo 8. Since $2x^2 + y^2$ can
never be congruent to 5 or 7 modulo 8, both cardinalities in (A) are 0 and hence
the condition is satisfied. If the Birch-Swinnerton-Dyer conjecture were true,
we would be able to conclude that any such $n$ is a congruent number. (There
is supporting evidence for this statement from the tables and the vanishing of
the so-called L-value of the corresponding elliptic curve.)

In particular, 157 would be a congruent number. This is in fact true. A proof of
this fact is furnished by the right-angled triangle whose sides $X, Y, Z$, displayed below,
were computed by Don Zagier. Again, this is the smallest solution for the area 157!
The legs are $X$, $Y$ where

$$X = \frac{6803298487826435051217540}{411340519227716149383203}, \qquad
Y = \frac{411340519227716149383203}{21666555693714761309610},$$

and the hypotenuse is $Z$ where

$$Z = \frac{224403517704336969924557513090674863160948472041}{8912332268928859588025535178967163570016480830}.$$

A natural question on the part of the reader would concern the appropriateness of
the word ‘congruent’ in the definition of congruent number. As to that, one cannot
do better than to quote Richard Guy: “Congruent numbers are perhaps confusingly
named”.

But,  after  all,  what’s  there  in  a  name? 
Fermat’s Last Theorem

A  Theorem  at  Last! 

C  S  Yogananda 

After more than three centuries of effort by some of the best mathematicians,
Gerhard Frey, J-P Serre, Ken Ribet and Andrew Wiles have finally succeeded
in proving Fermat’s assertion that the equation $X^n + Y^n = Z^n$ has no solutions
in non-zero integers if $n \ge 3$. Each of the four mathematicians made a decisive
contribution, with Wiles delivering the coup de grace. The proof, as it finally
came to be, is in some sense a triumph for Fermat.

When  Pierre  de  Fermat  died  in  1665,  he  had  not  published  a  single  mathematical 
work  (except  for  an  anonymous  appendix  to  a  book  written  by  a  colleague).  His 
mathematical discoveries were contained in his correspondence with other mathematicians
of his time, notably, Pascal, Frenicle de Bessy and Father Mersenne. He
also  left  behind  a  few  unpublished  manuscripts  and  marginal  notes  in  the  books  he 
studied.  We  have  to  be  grateful  to  his  son  Samuel  for  whatever  we  know  of  Fermat’s 
work.  Samuel  de  Fermat  went  through  his  father’s  papers  and  books  in  addition  to 
soliciting  letters  written  by  his  father  from  his  correspondents  in  order  to  publish 
them.  Among  Fermat’s  possessions  was  a  copy  of  the  Latin  translation,  by  Bachet, 
of  Diophantus’  Arithmetic  in  which  Fermat  had  made  a  number  of  marginal  notes. 

The first work Samuel chose to publish, in 1670, was a new edition of Bachet’s
Diophantus with an appendix containing forty eight marginal notes made by Fermat.
The second of these notes appears alongside problem 8 in Book II of Arithmetic:
“... given a number which is square, write it as a sum of two other squares”. Fermat’s
note states, in Latin, that “on the other hand, it is impossible for a cube to be written
as a sum of two cubes or a fourth power to be written as a sum of two fourth powers or,
in general, for any number which is a power greater than the second to be written as
a sum of two like powers. I have a truly marvellous demonstration of this proposition
which this margin is too narrow to contain”. Thus, it was in 1670 that the world
learnt of what has come to be termed Fermat’s Last Theorem (FLT): The equation

$$X^n + Y^n = Z^n$$

has no solutions in non-zero integers if $n \ge 3$. Fermat himself had given a proof
of this assertion for $n = 4$ using infinite descent, a method he invented, and Euler
proved the case $n = 3$. Thus, to prove FLT we need to show that $X^p + Y^p = Z^p$ has
no solutions in non-zero integers whenever $p$ is a prime greater than 3 (do you see
why?).

After more than three centuries of effort by some of the best mathematicians, Gerhard
Frey, J-P Serre, Ken Ribet and Andrew Wiles have finally succeeded in proving
Fermat’s assertion, each of them making a decisive contribution, with Wiles delivering
the coup de grace. The proof, as it finally came to be, is in some sense a triumph
for  Fermat.  Elliptic  curves  and  infinite  descent  play  significant  roles;  it  was  Fermat 
who  pioneered  the  use  of  elliptic  curves  in  solving  diophantine  equations,  and  it  is  to 
him  that  we  owe  the  method  of  infinite  descent. 


Diophantine  Equations 

The chief work of Diophantus of Alexandria (c. 250 A.D.) known to us is the
Arithmetic, a treatise in thirteen books, or Elements, of which only the first six have
survived. This work consists of about 150 problems, each of which asks for the solution
of a given set of algebraic equations in positive rational numbers, and so equations
for which we seek integer (or rational) solutions are referred to as diophantine
equations. The most familiar example we know is $X^2 + Y^2 = Z^2$, whose solutions are
Pythagorean triples; (3, 4, 5), (5, 12, 13) are examples of such triples. If, instead,
we ask for solutions in integers of $X^2 + Y^2 = 3Z^2$, we get an example of a diophantine
equation for which there are no solutions in non-zero integers. (To see this, first
observe that we may assume $X, Y, Z$ to be pairwise relatively prime, by cancelling
common factors, if any; and that any square when divided by 3 leaves remainder 0
or 1.) In fact, it is an interesting exercise to characterize the set of natural numbers $m$
for which the equation $X^2 + Y^2 = mZ^2$ has no solutions in non-zero integers.
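
The hint given in parentheses for $X^2 + Y^2 = 3Z^2$ can be expanded into a short argument. The following is a sketch of one way to fill in the steps, assuming as in the hint that $X, Y, Z$ have no common factor; it is not necessarily the route the author had in mind.

```latex
\begin{align*}
&\text{Squares modulo 3: } 0^2 \equiv 0,\quad 1^2 \equiv 1,\quad 2^2 \equiv 1 \pmod 3. \\
&X^2 + Y^2 = 3Z^2 \;\Rightarrow\; X^2 + Y^2 \equiv 0 \pmod 3
  \;\Rightarrow\; X \equiv Y \equiv 0 \pmod 3. \\
&\text{Then } 9 \mid X^2 + Y^2 = 3Z^2 \;\Rightarrow\; 3 \mid Z^2 \;\Rightarrow\; 3 \mid Z. \\
&\text{So } 3 \text{ divides } X,\ Y \text{ and } Z,
  \text{ contradicting the assumption that they have no common factor.}
\end{align*}
```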

To understand the role of geometry in solving diophantine equations, let us consider
the equation $X^2 + Y^2 = Z^2$. How do we characterize all solutions (in integers)
of this equation? We could assume again that $X, Y, Z$ is a primitive solution, i.e.,
$X, Y, Z$ are pairwise relatively prime. Dividing by $Z^2$ and putting $X/Z = x$ and
$Y/Z = y$ we get $x^2 + y^2 = 1$, that is to say, we get a rational point (a point both
of whose coordinates are rational numbers), $(x, y)$, on the unit circle centered at the
origin. Conversely, a rational point on the circle $x^2 + y^2 = 1$ will give us a (primitive)
Pythagorean triple. So, our problem reduces to finding all rational points on the unit
circle. We do this by drawing a line with rational slope passing through the point
$(-1, 0)$. This line will meet the circle at one more point and we claim that this point
is also rational. I shall leave it to you to figure out why it is so. (You need to use the
fact that if one root of a quadratic equation with rational coefficients is rational then
the other root is also rational.) This way we obtain all rational points on the circle.
Put $t = \tan(\theta/2)$ in the familiar parametrisation of the circle, $(\cos\theta, \sin\theta)$. Then we
get the well-known characterisation of the Pythagorean triples: if $m$ and $n$, $m > n$,
are relatively prime integers of opposite parity then the numbers

$$m^2 - n^2, \qquad 2mn, \qquad m^2 + n^2$$
History of FLT

• 1640, Fermat himself proved the case $n = 4$.

• 1770, Euler proved the case $n = 3$ (Gauss also gave a proof).

• 1823, Sophie Germain proved the first case of FLT for a class of primes,
the Sophie Germain primes: primes $p$ such that $2p + 1$ is also a prime.
(The first case of FLT holds if there is no solution of $X^p + Y^p = Z^p$
for which $p$ does not divide the product $XYZ$.)

• 1825, Dirichlet, Legendre proved FLT for $n = 5$.

• 1832, Dirichlet treated successfully the case $n = 14$.

• 1839, Lamé proved the case $n = 7$.

• 1847, Kummer proved FLT in the case when the exponent is a regular
prime. But it is not known, even today, whether there are infinitely
many Sophie Germain primes or regular primes.

• 1983, Faltings gave a proof of Mordell’s conjecture.

• 1986, Frey-Ribet-Serre: Shimura-Taniyama-Weil conjecture
implies FLT.

• 1994, Andrew Wiles: proof of S-T-W conjecture for semistable elliptic
curves.


form  a  primitive  Pythagorean  triple  and  every  primitive  Pythagorean  triple  arises  this 
way. 

This method can be used to find all rational points on a conic section whose defining
equation has rational coefficients, once we are able to find one such point.
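
The line-through-$(-1, 0)$ construction for the unit circle can be written out explicitly. Substituting $y = t(x + 1)$ into $x^2 + y^2 = 1$ gives a quadratic in $x$ with roots $-1$ and $(1 - t^2)/(1 + t^2)$. The sketch below uses this; the function names are illustrative, and the slope is taken as $t = n/m$ so that the output can be compared with the triple $m^2 - n^2$, $2mn$, $m^2 + n^2$.

```python
from fractions import Fraction
from math import lcm

def circle_point(t):
    """Second intersection of the line y = t(x + 1) with x^2 + y^2 = 1."""
    t = Fraction(t)
    return (1 - t * t) / (1 + t * t), 2 * t / (1 + t * t)

def triple_from_slope(m, n):
    """Pythagorean triple obtained from the rational slope t = n/m."""
    x, y = circle_point(Fraction(n, m))
    Z = lcm(x.denominator, y.denominator)   # clear denominators
    return int(x * Z), int(y * Z), Z

print(circle_point(Fraction(1, 2)))  # (Fraction(3, 5), Fraction(4, 5))
print(triple_from_slope(3, 2))       # (5, 12, 13)
```

With $m = 3$, $n = 2$ the result is $(5, 12, 13) = (m^2 - n^2, 2mn, m^2 + n^2)$, as the parametrisation predicts.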


Elliptic  Curves 

Consider  the  following  classical  problems. 

(i) Find all $n$ such that the sum of the squares of the first $n$ natural numbers is a
square. That is, we have to find natural numbers $n$ and $m$ such that

$$m^2 = n(n + 1)(2n + 1)/6.$$

(ii) (Diophantus) Find three rational right triangles of equal area.

Let $A$ denote the area of the right triangle with sides $a\ (= p^2 - q^2)$, $b\ (= 2pq)$
and $c\ (= p^2 + q^2)$; thus $A = pq(p^2 - q^2)$. Then if we put $x = p/q$ we get a
rational point $(p/q, 1/q^2)$ on the curve

$$Ay^2 = x^3 - x.$$

Conversely, if $(a/b, c/d)$ is a rational point on this curve then the right triangle
with $d(a^2 - b^2)/(b^2c)$ and $2ad/(bc)$ as legs also has area equal to $A$.

(iii) (From an Arab manuscript dated before the 9th century) Given a natural number
$n$, find a rational number $u$ such that both $u^2 + n$ and $u^2 - n$ are squares (of
rational numbers).
What is elliptic about elliptic curves?

Ellipses are not elliptic curves! Elliptic curves are so called because it was in
connection with the problem of computing arc lengths of ellipses that they
were first studied systematically. When we compute the arc length of a circle, we
have to integrate the function $1/\sqrt{1 - x^2}$, which we do in terms of the sine
and cosine functions. The trigonometric functions are therefore called circular
functions. Similarly, to compute the arc length of an ellipse, we have to
integrate functions of the form

$$1/\sqrt{(1 - x^2)(1 - k^2x^2)}.$$

This integral cannot be computed using circular functions and mathematicians
worked on this problem for many years before Abel and Jacobi independently
introduced elliptic functions to compute such integrals. Just as
sin and cos satisfy $x^2 + y^2 = 1$, the elliptic functions satisfy an equation of
the form $y^2 = f(x)$ where $f(x)$ is a cubic.


If  such  a  u  can  be  found  then  n  is  called  a  congruent  number.  A  number  n  being 
congruent  is  equivalent  to  the  existence  of  a  right  triangle  with  rational  sides  and 
area  n  (see  Chapter  12). 

Let $n$ be a congruent number and let $u$ be such that $u^2 + n = a^2$ and $u^2 - n = b^2$.
Multiplying the two equations together we get

$$u^4 - n^2 = (ab)^2.$$

Multiplying by $u^2$ throughout we get

$$u^6 - n^2u^2 = (abu)^2.$$

Putting $u^2 = x$ and $abu = y$ we get a rational point on the curve, $E$, defined by the
equation

$$y^2 = x^3 - n^2x.$$

EXERCISE.  Conversely,  if  (x,  y)  is  a  rational  point  on  E  such  that  x  is  a  rational 
square  and  has  an  even  denominator,  then  n  (whose  square  appears  as  the  coefficient 
of  x)  is  a  congruent  number. 

In each of the above problems, we were led to consider equations of the form
$y^2 = f(x)$, where $f(x)$ is a cubic polynomial in $x$ with rational coefficients and
distinct roots. Such equations define elliptic curves. We could think of elliptic curves
as the set of all rational/real/complex solutions of such equations. The set of all
complex solutions of an elliptic curve can be identified with the points on a torus.
The figures below (Figure 15.1) show what the real and complex points on an elliptic
curve look like.

Finding rational points on an elliptic curve turns out to be a difficult problem
and though many deep results have been proved (one of them by Andrew Wiles
along with John Coates), a lot remains to be done in this area. The study of elliptic
curves is currently a very active field of research involving many different areas of
mathematics.

Figure 15.1 Typical illustration depicting what the real/complex points on an elliptic
curve look like.

If  we  try  to  imitate  the  method  we  used  for  a  conic  to  get  more  rational  points 
from  one  such  point  we  are  stuck.  This  is  because  generally,  a  line  meets  a  cubic 
curve  at  three  points  and  we  cannot  conclude  that  the  other  points  of  intersection  are 
rational.  That  is,  if  one  root  of  a  cubic  equation  with  rational  coefficients  is  rational, 
the  other  two  roots  could  be  irrational;  they  could  be  conjugate  surds,  for  instance. 
What  is  true  is  that  if  you  draw  the  line  joining  two  rational  points,  then  the  third 
point  where  this  line  meets  the  cubic  will  also  be  a  rational  point.  Thus,  we  can  ‘add’ 
two  rational  points  to  get  a  third  rational  point.  It  turns  out  that  we  could  take  the 
‘point  at  infinity’  as  the  identity  or  the  ‘zero’  element  and  obtain  a  structure  of  a 
group  (in  fact,  a  commutative  group)  on  the  set  of  rational  points  of  an  elliptic  curve 
by  declaring  the  sum  of  three  collinear  points  to  be  zero;  the  inverse  or  ‘negative’ 
of  the  point  (x,  y)  is  the  point  (x,  -y).  Thus,  to  add  two  points  P  and  Q  join  them 
by  a  straight  line,  find  the  third  point  of  intersection  of  the  line  with  the  curve  and 
reflect  it  in  the  x-axis  to  get  a  point,  R,  on  the  curve  which  will  then  be  the  ‘sum’  of 
P  and  Q. 

EXERCISE. Consider the elliptic curve, E, defined by the equation $y^2 = ax^3 + bx^2 + cx + d$.
Obtain an expression for the coordinates $x_3, y_3$ of the sum of the two points
$P = (x_1, y_1)$ and $Q = (x_2, y_2)$ on E in terms of $x_1, x_2, y_1, y_2$.

Hint: If P is not equal to Q, $x_3 = -x_1 - x_2 - \frac{b}{a} + \frac{(y_2 - y_1)^2}{a(x_2 - x_1)^2}$,
and if P = Q, $x_3 = -2x_1 - \frac{b}{a} + \frac{(f'(x_1))^2}{a(2y_1)^2}$, where $f(x)$ denotes the
cubic.
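The chord-and-tangent rule described above can be sketched in a few lines of exact rational arithmetic. This is an illustrative sketch only: the function names, the use of `None` for the point at infinity, and the example curve $y^2 = x^3 - 36x$ with the point (-3, 9) are assumptions, not part of the text. The x-coordinate comes from the fact that the three roots of the cubic obtained by substituting the line into the curve sum to $(m^2 - b)/a$, where m is the slope of the line.

```python
from fractions import Fraction

def add_points(coeffs, P, Q):
    """Chord-and-tangent sum of affine points P, Q on y^2 = a x^3 + b x^2 + c x + d.

    Returns None for the point at infinity (the identity).
    """
    a, b, c, d = map(Fraction, coeffs)
    (x1, y1), (x2, y2) = [tuple(map(Fraction, pt)) for pt in (P, Q)]
    if x1 == x2 and y1 == -y2:
        return None  # P and Q are negatives of each other (vertical line)
    if x1 == x2:
        # P == Q: the slope of the tangent is f'(x1) / (2 y1)
        m = (3 * a * x1 ** 2 + 2 * b * x1 + c) / (2 * y1)
    else:
        m = (y2 - y1) / (x2 - x1)
    # The three x-coordinates on the line sum to (m^2 - b) / a.
    x3 = m * m / a - b / a - x1 - x2
    # The third intersection has y = y1 + m (x3 - x1); reflect it in the x-axis.
    y3 = -(y1 + m * (x3 - x1))
    return x3, y3

def on_curve(coeffs, P):
    """True if P satisfies y^2 = a x^3 + b x^2 + c x + d."""
    a, b, c, d = map(Fraction, coeffs)
    x, y = P
    return y * y == a * x ** 3 + b * x ** 2 + c * x + d

E = (1, 0, -36, 0)                   # assumed example: y^2 = x^3 - 36 x
P = (Fraction(-3), Fraction(9))      # 9^2 = (-3)^3 - 36 * (-3)
double_P = add_points(E, P, P)
triple_P = add_points(E, double_P, P)
print(double_P, triple_P)
```

In this example the double of P has x-coordinate 25/4 = (5/2)^2, and each new point can be checked to lie on the curve with `on_curve`.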

The structure of a group on the set of rational points of an elliptic curve provides
us with a powerful tool to study diophantine equations. For instance, in problem (ii),
if we get one rational point then we could 'double' it (i.e., draw the tangent at that
point and take its further intersection with the curve) to get one more point and then
add these two to get yet another point, and so on. In fact, this is what Fermat used to
get more solutions to the problem (even Diophantus used this procedure but he gave
only three rational points). In the congruent number problem, it turns out that the
double of any rational point which is not of order 2 is such that the x-coordinate is
the square of a rational number with an even denominator.

The method we used to show the non-existence of solutions of $X^2 + Y^2 = 3Z^2$ by
showing that the equation has no solutions modulo 3 is a standard method we use in
studying diophantine equations. Assume that the equation has integer coefficients by
clearing the denominators, if necessary. We reduce the equation modulo a prime p by
replacing the coefficients of the equation by their remainders when divided by p and
consider the set of solutions of the reduced equation in the finite field
$\{0, 1, 2, \ldots, p - 1\}$. If, for example, we find a prime for which there are no
solutions for the reduced equation, it follows immediately that the original equation
has no rational solutions.
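A brute-force search over residues makes the reduction step concrete. The sketch below assumes that the equation meant above is $X^2 + Y^2 = 3Z^2$; the function name is hypothetical. Modulo 3 the right-hand side vanishes, so the code lists all residue pairs with $x^2 + y^2 \equiv 0$.

```python
def zero_sum_square_pairs(p):
    """All residue pairs (x, y) modulo p with x^2 + y^2 congruent to 0 (mod p)."""
    return [(x, y) for x in range(p) for y in range(p)
            if (x * x + y * y) % p == 0]

print(zero_sum_square_pairs(3))
print(zero_sum_square_pairs(5))
```

Modulo 3 only the pair (0, 0) survives, so in any integer solution 3 divides both X and Y; then 9 divides $3Z^2$, so 3 divides Z as well, and the whole solution can be divided by 3. Hence there is no integer solution other than the trivial one. Modulo 5, by contrast, non-trivial pairs exist, so that prime gives no obstruction.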

Consider an elliptic curve E defined by $y^2 = f(x)$. Except for a finite set of
primes depending on the cubic $f(x)$, the reduced equation will also define an elliptic
curve. In fact, the exceptional set of primes is precisely the set of prime divisors
of the discriminant of the cubic $f(x)$. For a prime p not dividing the discriminant,
let $N_p$ denote the number of points of E modulo p, i.e., the number of pairs (x, y),
with x, y in $\{0, 1, 2, \ldots, p - 1\}$, satisfying the equation modulo p, together
with the point at infinity. Define integers $a_p$ by

$$N_p = p + 1 - a_p.$$

These $a_p$'s could be positive or negative and Hasse proved the following inequality
in the 1930s:

$$|a_p| < 2\sqrt{p}.$$
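For small primes the numbers $N_p$ and $a_p$ can be found by direct counting. The sketch below is an illustration under stated assumptions: it takes the curve in the form $y^2 = x^3 + Ax + B$, counts the point at infinity as one of the $N_p$ points, and uses the example curve $y^2 = x^3 - 36x$ (whose discriminant has only the prime factors 2 and 3). The function names are hypothetical.

```python
def count_points(A, B, p):
    """Number of points on y^2 = x^3 + A x + B modulo p, including the point at infinity."""
    # square_roots[r] = how many y in {0, ..., p-1} satisfy y^2 = r (mod p).
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    affine = sum(square_roots[(x ** 3 + A * x + B) % p] for x in range(p))
    return affine + 1  # +1 for the point at infinity

def trace(A, B, p):
    """The integer a_p defined by N_p = p + 1 - a_p."""
    return p + 1 - count_points(A, B, p)

# Assumed example curve: y^2 = x^3 - 36 x, at primes not dividing its discriminant.
for p in (5, 7, 11, 13, 17, 19, 23):
    a_p = trace(-36, 0, p)
    # a_p^2 < 4p is the inequality |a_p| < 2 sqrt(p) without floating point.
    print(p, count_points(-36, 0, p), a_p, a_p * a_p < 4 * p)
```

The last column checks Hasse's inequality in the integer form $a_p^2 < 4p$.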

These  numbers  contain  a  lot  of  information  about  the  rational  points  of  the  elliptic 
curve  and  there  are  many  conjectures  concerning  their  properties  among  which  the 
Birch-Swinnerton-Dyer  conjecture  and  the  Shimura-Taniyama-Weil  conjecture  are 
the  most  important. 

The content of the Shimura-Taniyama-Weil (S-T-W) conjecture is that these $a_p$'s
are the Fourier coefficients of a cusp form (of weight 2 and a certain level N). The
definition of cusp forms is beyond the scope of this chapter and we content ourselves
by saying that they are certain functions on the upper half-plane (please see
Suggested Reading at the end). Elliptic curves for which the $a_p$'s satisfy the S-T-W
conjecture are called modular elliptic curves.
Frey Elliptic Curve and Fermat's Last Theorem

The study of rational points on higher degree curves witnessed a breakthrough in
1983 when Gerd Faltings proved a conjecture of Mordell. As a corollary, it follows
that the curve $X^n + Y^n = 1$ has only finitely many rational points if n > 5, which
means that there would be at most finitely many solutions to the Fermat equation

$$X^n + Y^n = Z^n.$$

The  general  feeling  among  mathematicians  following  this  was  one  of  satisfaction 
since  there  was  no  reason  or  heuristic  basis  as  to  why  FLT  should  be  true;  at  most 
finitely  many  solutions  was  good  enough. 

But FLT bounced back soon after in 1985, when Gerhard Frey linked a counter
example of FLT, if there is one, with an elliptic curve which did not seem to satisfy
the S-T-W conjecture! Frey's was a simple but very ingenious idea: if, for some
prime p > 3, there are non-zero integers u, v, w such that $u^p + v^p = w^p$, then consider
the elliptic curve, now referred to as the Frey curve,

$$y^2 = x(x + u^p)(x - v^p).$$

Thus  for  the  first  time,  FLT  for  any  exponent  was  connected  with  a  cubic  curve 
instead  of  the  higher  degree  curve  which  the  equation  itself  defines. 

Then  things  started  happening  fast  and  in  the  summer  of  1986,  building  on  the 
work  of  Frey  and  Serre,  Ribet  succeeded  in  proving  that  S-T-W  implies  FLT  by 
showing  that  the  Frey  curve  could  not  be  modular.  Now,  FLT  was  not  just  a  curiosity 
but  was  related  to  a  deep  conjecture;  if  it  were  not  true  and  we  had  a  counter  example, 
the  Frey  curve  would  be  sticking  out  like  a  sore  thumb! 

Soon  after  he  heard  of  Ribet’s  result,  Andrew  Wiles  went  to  work  on  the  S-T-W 
conjecture  in  the  late  summer  of  1986.  After  working  hard  on  it  for  seven  years, 
during  which  time  even  his  closest  friends  did  not  get  to  know  what  he  was  up  to, 
Wiles  stunned  the  mathematical  world  by  claiming  that  he  had  proved  the  FLT  by 
proving  a  particular  case  of  the  S-T-W  conjecture,  the  case  of  semi-stable  elliptic 
curves.  He  made  the  announcement  at  the  end  of  a  series  of  lectures  at  the  Isaac 
Newton  Institute  in  Cambridge,  England  on  the  morning  of  Wednesday,  June  23, 
1993.  But  experts  checking  his  proof  found  many  gaps  of  which  he  could  overcome 
all  but  one.  It  is  to  the  credit  of  Wiles  that  he  did  not  let  this  setback  deter  him. 
Rather,  encouraged  and  mathematically  supported  by  his  students  and  close  friends, 
notably  Henri  Darmon,  Fred  Diamond  and  Richard  Taylor,  he  circumvented  the  gap 
in September 1994. His paper, along with another one written jointly with Richard
Taylor, occupies one whole issue of the leading journal Annals of Mathematics, 141
(1995). It should be remarked that the theorem Wiles proved is a very significant
result with far-reaching consequences and FLT follows as a simple corollary.

Apparently, Fermat's favourite targets for his problems and challenges were the
English  mathematicians;  after  all,  he  was  French!  Thus,  it  is  fitting  that  his  most 
famous  challenge  has  been  answered  by  Wiles,  an  Englishman,  though  it  took  a 
while  (A  Wiles!)  coming! 
Some Unsolved Problems in Number Theory

Progress  Made  in  Recent  Times 

K  Ramachandra 

The  beauty  of  the  theory  of  numbers  is  that  it  poses  so  many  simple-looking 
problems,  most  of  which  remain  unsolved  even  today.  Many  of  these  problems 
have  come  down  to  us  from  ancient  times,  indicating  the  age-old 
fascination  that  human  beings  have  felt  for  numbers.  We  list  a  few  of  these 
problems  below,  describing  some  known  results  and  indicating  the  progress 
made  in  recent  times. 


The  Infinitude  of  Primes 

It is easy to show that the list 2, 3, 5, 7, 11, ... of primes does not terminate. The
biggest prime known explicitly today has more than $10^5$ digits! Now consider pairs
of primes that differ by 2, for instance (3, 5), (5, 7), (11, 13), (17, 19), ... . These are
the so-called twin primes. It is not known as of today whether the list of twin primes
terminates or not. It is known that the sum $1/3 + 1/5 + 1/11 + 1/17 + \cdots = \sum 1/p$
taken over all primes p such that p + 2 is prime is finite (indeed, the sum, known
as Brun's constant, can be computed to a fair degree of accuracy), but this does
not prove that there are only finitely many such primes. (It is clearly possible for a
sum of infinitely many positive numbers to be finite; for instance, this happens with
the set $\{1/2, 1/4, 1/8, 1/16, \ldots\}$. Ancient Greeks believed this
was impossible. The well-known paradoxes of Zeno are related to this observation.)
The best that we know today is that the list of pairs (p, q) of primes with

$$0 < p - q < c \ln p \qquad (c = 1/4)$$

does not terminate. This is a very deep result due to H Maier of Germany. (Actually
his constant c is slightly less than 1/4.) We are very far from this result for, say,
c = 1/100.
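The twin primes and the partial sums of the series above are easy to tabulate. The following is a small sketch; the function names and the cut-off of $10^5$ are arbitrary choices for illustration, and a finite computation of this kind says nothing about whether the list terminates.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 1)."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def twin_primes_up_to(n):
    """Pairs (p, p + 2) of primes with p + 2 <= n."""
    ps = primes_up_to(n)
    prime_set = set(ps)
    return [(p, p + 2) for p in ps if p + 2 in prime_set]

limit = 10 ** 5  # arbitrary cut-off chosen for illustration
twins = twin_primes_up_to(limit)
# Partial sum of 1/p over primes p with p + 2 also prime, as in the text.
partial_sum = sum(1.0 / p for p, _ in twins)
print(len(twins), twins[:4], partial_sum)
```

The partial sums grow very slowly as the cut-off increases, which is consistent with the convergence of the full series.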
Another question deals with the number $\pi(x)$ of primes p below x. It was noticed
by Legendre, Gauss, Riemann and others that $\pi(x)$ is roughly equal to $x/\ln x$; this is
equivalent to saying that the nth prime is roughly equal to $n \ln n$. Chebychev showed
that there exist positive constants a, b such that

$$a\,\frac{x}{\ln x} < \pi(x) < b\,\frac{x}{\ln x}$$

for all x. Using the methods of complex variables, Hadamard and de la Vallee Poussin
proved independently in the 1890's that

$$\lim_{x \to \infty} \frac{\pi(x)}{x/\ln x} = 1.$$

Instead of $\pi(x)$, it is nicer to deal with the function $\theta(x)$ which counts the prime p
with the weight $\ln p$; that is, $\theta(x) = \sum_{p \le x} \ln p$. It was proved around the turn of the
century that

$$|\theta(x) - x| < x\, e^{-h\sqrt{\ln x}}$$

for all $x > 10^{100}$ and a certain absolute positive constant h. The precise value of h is
not important. One of the deepest results in prime number theory is the theorem that
the term $\sqrt{\ln x}$ can be replaced by

$$(\ln x)^{3/5} (\ln \ln x)^{-1/5}.$$

This result is due to the Soviet mathematician I M Vinogradov.
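The two approximations discussed here, the prime count against $x/\ln x$ and the weighted count against x, can be compared numerically. This is a rough sketch with hypothetical function names; the sample values of x are arbitrary, and such small ranges only illustrate the trend rather than the error bounds quoted above.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 1)."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def prime_statistics(x):
    """Return (number of primes <= x, sum of ln p over primes p <= x)."""
    ps = primes_up_to(x)
    return len(ps), sum(math.log(p) for p in ps)

for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
    count, weighted = prime_statistics(x)
    # Both ratios should drift towards 1 as x grows.
    print(x, count, count / (x / math.log(x)), weighted / x)
```

The ratio for the weighted count approaches 1 noticeably faster than the ratio for the plain count, which is one reason the weighted function is the nicer one to work with.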


Additive  Prime  Number  Theory 


In 1742 Goldbach asked, in a letter to Euler, whether every even number from 6
onwards can be expressed as a sum of two odd primes. The answer to this question
is unknown even today! The achievements in this problem have a very long history.
Using the so-called 'circle method' pioneered by Ramanujan and Hardy, Hardy and
Littlewood showed that if the hypothesis formulated below holds true, then every odd
number from some point onwards can be expressed as a sum of 3 odd primes.
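Goldbach's question can at least be tested on a finite range. The sketch below is only a finite check up to an arbitrarily chosen bound, with hypothetical function names; it proves nothing about all even numbers.

```python
def odd_prime_flags(n):
    """flags[m] is True exactly when m is an odd prime <= n (for n >= 1)."""
    flags = [True] * (n + 1)
    flags[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, n + 1, i):
                flags[j] = False
    if n >= 2:
        flags[2] = False  # the question concerns odd primes only
    return flags

def goldbach_pair(m, flags):
    """Return (p, m - p) with p the smallest odd prime such that m - p is an odd prime, else None."""
    for p in range(3, m // 2 + 1, 2):
        if flags[p] and flags[m - p]:
            return p, m - p
    return None

limit = 10_000  # arbitrary bound for this finite check
flags = odd_prime_flags(limit)
failures = [m for m in range(6, limit + 1, 2) if goldbach_pair(m, flags) is None]
print(goldbach_pair(100, flags), failures)
```

An empty list of failures means every even number from 6 up to the bound has at least one representation as a sum of two odd primes.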

The hypothesis is stated in terms of the following function $\mu$ defined on the set of
positive integers:

$$\mu(n) = \begin{cases} 1, & \text{for } n = 1; \\ 0, & \text{if $n$ is divisible by the square of a prime;} \\ (-1)^k, & \text{if $n$ is the product of $k$ distinct primes.} \end{cases}$$

Let a, b be positive integers, and let h > 3/4 be a constant. The hypothesis then states
that the following inequality holds for all x > N(a, b, h), where N is some function
that depends only on a, b, h:

$$\left| \sum_{1 \le n \le x} \mu(an + b) \right| < x^h.$$

This hypothesis is open as of today. It is considered very difficult to prove, even in
the special case $a = b = 1$, $h = 1 - 10^{-100}$.
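The function defined above (the Möbius function, usually written $\mu$) and the sums in the hypothesis are simple to compute for small x. The following sketch uses trial division; the function names, the sample exponent h = 0.76 and the sample values of x are assumptions for illustration, and a numerical check of this kind gives no information about the threshold N(a, b, h).

```python
def mobius(n):
    """Moebius function: 1 at n = 1, 0 if a prime square divides n, else (-1)^k."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # d^2 divides the original n
            result = -result
        d += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result

def mobius_sum(a, b, x):
    """Sum of mobius(a*n + b) over 1 <= n <= x."""
    return sum(mobius(a * n + b) for n in range(1, x + 1))

h = 0.76  # hypothetical exponent, chosen only to satisfy h > 3/4
for x in (10, 100, 1000, 10000):
    s = mobius_sum(1, 1, x)
    print(x, s, abs(s) < x ** h)
```

The sums stay small because the values +1 and -1 cancel heavily; the hypothesis asserts that this cancellation persists for every arithmetic progression an + b.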




The  Circle  Method 

(The 'circle method' was developed by Ramanujan and Hardy while they
were working on the partition problem. The problem is to find an asymptotically
accurate formula for p(n), the number of partitions of n or the number
of ways that n can be written as an unordered sum of positive integers
(p(1) = 1, p(2) = 2, p(3) = 3, p(4) = 5, ...). It has been known from the
time of Euler that

$$\prod_{j=1}^{\infty} (1 - z^j)^{-1} = 1 + \sum_{n=1}^{\infty} p(n) z^n.$$

Let f(z) denote the infinite product on the left side. The singularities of
f(z) are the roots of unity and lie densely on the unit circle |z| = 1; thus
f(z) has the unit circle as its circle of convergence. Using Cauchy's residue
theorem, we obtain

$$p(n) = \frac{1}{2\pi i} \oint_{|z| = r} \frac{f(z)}{z^{n+1}} \, dz,$$

for 0 < r < 1. Thus the problem of estimating p(n) has been converted into
one of estimating an integral. The beautiful and amazingly productive idea
pioneered by Ramanujan and Hardy was to estimate the integral by identifying
the points where 'most' of the contribution comes from; these are clearly
the points on |z| = r that lie 'close' to the poles of f(z). The practical details
are formidable, but what is of significance is that the method, originally
conceived to tackle the partition problem, has turned out to be applicable to a
large class of related problems, for instance, Waring's problem.)
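Euler's product gives a direct way to compute p(n): multiplying the series by one factor $1/(1 - z^j)$ at a time is a simple dynamic programme. The sketch below does this and compares the result with the leading term of the Hardy-Ramanujan asymptotic formula, $e^{\pi\sqrt{2n/3}}/(4n\sqrt{3})$. That formula is the standard leading term and is not stated in the text above; the function names are hypothetical.

```python
import math

def partition_numbers(limit):
    """p(0), ..., p(limit), read off as coefficients of Euler's product."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        # Multiplying by 1/(1 - z^part) allows any number of parts of this size.
        for n in range(part, limit + 1):
            p[n] += p[n - part]
    return p

def hardy_ramanujan_estimate(n):
    """Leading asymptotic term exp(pi*sqrt(2n/3)) / (4*n*sqrt(3)) for p(n)."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))

p = partition_numbers(200)
for n in (10, 50, 100, 200):
    # The ratio of the estimate to the exact value tends to 1 as n grows.
    print(n, p[n], hardy_ramanujan_estimate(n) / p[n])
```

The exact values use only integer arithmetic, so they can serve as a reference when experimenting with more refined asymptotic terms.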


However, in 1937 Vinogradov proved the same result without having to use any
unproved hypothesis. A recent result in the direction of Goldbach's conjecture is the
one by O Ramare: Every positive even number can be expressed as a sum of not more
than 6 primes. Another result by the author and his colleagues A Sankarayanan and
K Srinivas is the following: let $g_n$ denote the nth even number expressible as a sum
of 2 odd primes ($g_1 = 6$, $g_2 = 8$, $g_3 = 10$, ...). We do not know whether the range
of g exhausts the even numbers beyond 6, but the following is now known:

(gn+ 1  -  gn)3/  <  kgn  for  all  n, 


where  k  is  a  positive  constant  independent  of  n. 


Waring’s  Problem 

Let k be any natural number greater than 1. More than two centuries back, E Waring
conjectured the following. Let $g(k) = 2^k + [1.5^k] - 2$, where $[y]$ denotes the integer
part of y, and write g for g(k). Then every positive integer n can be expressed as a
sum of g or fewer positive kth powers; that is, for all $n \in \mathbb{N}$ there exist
non-negative integers $x_1, x_2, \ldots, x_g$ such that
$n = x_1^k + x_2^k + \cdots + x_g^k$. It is not too hard to check that the number
$q = 2^k [1.5^k] - 1$ cannot be expressed as a sum of fewer than g positive kth powers;
that is, the equation

$$x_1^k + x_2^k + \cdots + x_{g-1}^k = q$$

has no solution in non-negative integers $x_i$. (Example: Let k = 3; then g = 8 + 3 - 2
= 9 and q = (8 x 3) - 1 = 23. Since $23 < 3^3$, to express 23 as a sum of positive
cubes we must use only the summands 1 and 8, and since 23 = (2 x 8) + (7 x 1),
we require at least 9 such summands. Thus 23 cannot be expressed as a sum of fewer
than 9 positive cubes.) Thus g is the most economical number of summands.
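The example with k = 3 can be repeated for other small exponents. The sketch below computes g(k) and q exactly (using $3^k // 2^k$ for the integer part of $1.5^k$) and finds the least number of kth powers needed for q by dynamic programming. The function names are hypothetical and the approach is brute force, suitable only for small k.

```python
def waring_g(k):
    """g(k) = 2^k + [1.5^k] - 2, with [1.5^k] computed exactly as 3^k // 2^k."""
    return 2 ** k + 3 ** k // 2 ** k - 2

def hard_number(k):
    """q = 2^k * [1.5^k] - 1."""
    return 2 ** k * (3 ** k // 2 ** k) - 1

def min_kth_powers(n, k):
    """Least number of positive k-th powers summing to n >= 1 (dynamic programming)."""
    powers = []
    base = 1
    while base ** k <= n:
        powers.append(base ** k)
        base += 1
    best = [0] * (n + 1)
    for m in range(1, n + 1):
        # Use one power pw <= m, then solve the smaller problem m - pw optimally.
        best[m] = 1 + min(best[m - pw] for pw in powers if pw <= m)
    return best[n]

for k in (2, 3, 4):
    q = hard_number(k)
    print(k, waring_g(k), q, min_kth_powers(q, k))
```

For each of these exponents the number q needs exactly g(k) summands, in agreement with the argument given for cubes.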

The current status of the problem is as follows: There exists an absolute positive
constant C such that Waring's conjecture is true for all k > C. The proof derives
from the ideas of Ramanujan, Hardy, Littlewood, Vinogradov, Dickson, Ridout and
Mahler and is very complicated, running to hundreds of pages. It should be mentioned
that the proof only establishes the existence of C and gives no clue as to its
magnitude; no C, however large, can be calculated by the method of proof.

Problems  on  Irrationality 

Consider the zeta function $\zeta(t)$ defined for real numbers t > 1 as follows:
$\zeta(t) = \sum_{n=1}^{\infty} 1/n^t$. One of the grand achievements of the century is the proof that

$$\zeta(3) = 1 + \frac{1}{2^3} + \frac{1}{3^3} + \frac{1}{4^3} + \cdots$$

is irrational. (An irrational number is one that is not expressible as a ratio of two
non-zero integers. Related to the idea of irrationality is the notion of transcendence.
A number is algebraic if it is the root of a polynomial with integral coefficients; else
it is transcendental. Examples of algebraic irrationals are $\sqrt{2}$, $\sqrt[3]{2}$ and $\sqrt{10} + \sqrt[3]{21}$,
and examples of transcendental numbers are $\pi$, e and ln 2 (here e = 2.71828... is
Euler's number). The proof that a given number is transcendental can be extremely
difficult.) The proof is due to R Apery. What happens when t is an odd positive
integer greater than 3 is open. Strangely, a great deal is known when t is an even
positive integer. Indeed, it is known that the value of $\zeta(t)$ is a rational multiple of $\pi^t$
whenever t is an even positive integer. (This has been known since the time of Euler.)
This immediately implies that $\zeta(t)$ is irrational, indeed transcendental, when t is an
even positive integer. The paucity of positive conclusions for the case when t is an
odd positive integer is extremely curious.
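The contrast between even and odd arguments can be seen numerically. The sketch below sums the series directly and compares the results with the classical closed forms $\zeta(2) = \pi^2/6$ and $\zeta(4) = \pi^4/90$, which are standard instances of Euler's result and are not stated explicitly above. The function name and the truncation point are arbitrary choices for illustration.

```python
import math

def zeta_partial(t, terms):
    """Partial sum 1 + 1/2^t + ... + 1/terms^t of the series for zeta(t)."""
    # Add the smallest terms first to limit floating-point rounding error.
    return sum(1.0 / n ** t for n in range(terms, 0, -1))

terms = 100_000  # arbitrary truncation point
print(zeta_partial(2, terms), math.pi ** 2 / 6)
print(zeta_partial(4, terms), math.pi ** 4 / 90)
print(zeta_partial(3, terms))  # no comparable closed form is known
```

For t = 2 and t = 4 the partial sums approach a rational multiple of a power of $\pi$, whereas for t = 3 the computation yields only a decimal value, about 1.2020569.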

Much the same can be said for Euler's constant $\gamma$ defined thus:

$$\gamma = \lim_{n \to \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln n \right).$$

Amazingly, it is not known whether $\gamma$ is rational or not.

The transcendency of numbers such as $\pi + \ln 2$ was first proved by A Baker. These
are deep results.
Concluding Remarks


It  appears  that  there  is  no  dearth  of  attractive  problems.  What  is  needed  are  solutions! 
What  has  been  solved  is  very  little  and  what  remains  to  be  solved  is  vast.  In  figurative 
terms, what has been solved can be likened to an egg-shell, and what remains to be
solved  to  the  infinite  space  surrounding  it. 


Addendum to "Some Unsolved Problems in Number Theory"

(Resonance, May 1997)

S S Pillai: The omission of the name S S Pillai (Siva Sankaranarayana Pillai)
in connection with Waring's problem is very serious. In a series of papers, Pillai
proved that if k > 6 and further if $(3^k + 1)/(2^k - 1) < [1.5^k] + 1$ then Waring's
conjecture is correct for that k. Around the same time (but a little later)
L E Dickson proved this with k > 7 and $(3^k + 1)/(2^k - 1) < [1.5^k] + 1$.
The inequality $(3^k + 1)/(2^k - 1) < [1.5^k] + 1$ was proved for all integers
exceeding a certain constant C (same C as in the paragraph on Waring's problem)
by K Mahler. The history of this discovery is very well explained in Introduction
to the Theory of Numbers by G H Hardy and E M Wright (see notes at the end of
chapter XXI). For another treasure house of information regarding priority of
Pillai's work see K Chandrasekharan, S S Pillai (obituary), J. Indian Math. Soc.,
Vol. 15, pp 1-10, 1951. Regarding Pillai's achievements I mention the following:
when I was in the Institute for Advanced Study, Princeton, USA, during 1970-71,
I noticed in the Institute Library a book by G H Hardy where he places Pillai as
the greatest Indian mathematician after Srinivasa Ramanujan. Waring's conjecture
was proved for k = 5 by Chen Jing-Run (around 1970) and for k = 4 by
R Balasubramanian, J-M Deshouillers and F Dress in 1989. Cases k = 2 and 3 were
disposed of (by simpler methods) by Lagrange and Wieferich respectively. About
Pillai I have the following comment: Once I was talking to a responsible Indian
specialist dealing with History of Mathematics. I was very surprised when I came
to know that he had not heard of Pillai at all. I can account for it as follows.
Pillai was very unassuming; he was a member of the Indian Mathematical Society
all right; but he was not a fellow of any of the academies and he had no publicity
whatsoever amongst mathematicians who had not looked at the book by
G H Hardy and E M Wright mentioned earlier.

The equation $\left| \sum \mu(an + b) \right| < x^h$ under the section 'Additive Prime
Number Theory' should read

$$\left| \sum_{1 \le n \le x} \mu(an + b) \right| < x^h.$$


3.  A  comment  on  The  Circle  Method  in  the  box  on  page  78: 

The  function  f(z)  is  analytic  in  \z\  <  1  and  it  does  not  exist  anywhere 
in  \z\  >  1.  (So  the  terminology  poles  of  f(z )  is  not  correct).  We  have  to 
make  r  a  suitable  function  of  n  but  still  less  than  1.  Then  decompose  this 
circle  into  small  bits  in  a  particular  way  and  obtain  asymptotics  of  each  bit. 
The  cumulative  effect  of  adding  all  these  asymptotics  will  give  the  Hardy- 
Ramanujan  formula  for  partitions.  Actually  Ramanujan  in  his  first  letter 
(this  letter  was  written  from  the  Madras  Port  Trust)  to  Hardy  mentions  (see 
equation  1.14  of  Twelve  Lectures )  that  the  integer  q(n)  defined  by 


(note that LHS is the product

$$\prod_{n=1}^{\infty} \bigl( (1 - x^{n})(1 - x^{2n-1}) \bigr)^{-1})$$


is  the  integer  nearest  to 


When  questioned  about  this,  he  wrote  in  a  letter  that  it  is  “not  the  integer 
nearest  to  but  this  main  term  plus  . . .  ”.  (Compare  this  main  term  with  the 
first  term  of  the  Hardy-Ramanujan-Rademacher  formula  for  p(n)). 